Paul Roebber

(University of Wisconsin-Milwaukee)

Applications of Machine Learning to Atmospheric Science

What Meteo Colloquium
When Jan 17, 2018
from 03:30 pm to 04:30 pm
Where 112 Walker
Contact Name Steven Greybush

Paul J. Roebber

Distinguished Professor, Atmospheric Science Group
Department of Mathematical Sciences
University of Wisconsin at Milwaukee

Even my 88-year-old mother has heard terms like “Big Data,” “Data Analytics,” and “Machine Learning.” But stripping away the marketing noise, what are they, really, and why should physical scientists care about them? The ability to leverage data to improve understanding has always been important, and it is becoming more so as data becomes more readily available and the need grows to extract some measure of value from its rising volume. Data analytics provides the methodology. A practitioner in this field needs application-oriented math and statistics knowledge allied with substantive domain expertise. Because the software tools needed to perform the necessary analyses are not mature, and often must be custom-designed, programming skills are also important. Machine Learning enters this discussion as one type of tool that can be used to perform this task.

Multiple linear regression (MLR) has seen wide use in economics and affiliated fields, as it is a useful technique for assessing the relationships between variables and thereby developing understanding from data. In the form of Model Output Statistics (MOS), which seeks to map numerical weather prediction model output to observations, MLR represents an early, simple application of data analytics to weather prediction. More sophisticated techniques, such as artificial neural networks (ANN), including their extension to Deep Learning, and other machine-learning approaches such as Evolutionary Programming, are now gaining currency in many fields and have excellent potential for use in the atmospheric sciences.

A straightforward example of an atmospheric science question that can be answered with data analytics is “Can we forecast daily peak electricity load given available atmospheric inputs?” Rather than build a comprehensive numerical model that encompasses both the meteorology and the resulting energy usage in the built environment, a data analytics approach would start by collecting relevant data and building a model using MLR or an ANN. Given the curse of dimensionality, which requires an exponential increase in the length of time-series data as the number of variables considered increases, we would need to know something about energy usage to guide our choice of data to collect. The built model would confirm that the most predictive variable by far is temperature and, in the warm season, apparent temperature (the combination of temperature and humidity). It would also show that other information, such as time of day, wind speed and direction, cloud cover, and snow on the ground, is relevant in some situations, and that changing energy usage patterns over time need to be accounted for in the analysis. In this seminar, I will provide specific examples in the meteorological domain using MLR, multiple logistic regression, and ANN, with a special emphasis on Evolutionary Programs.
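
As a rough illustration of the MLR starting point described above, the following Python sketch fits a peak-load model by ordinary least squares. It is not the speaker's method or data: the predictor names (`temp_c`, `wind_ms`, `cloud_frac`), the coefficients in `true_coef`, and the noise level are all hypothetical, and the data are synthetic, generated only so the example runs on its own.

```python
import numpy as np


def fit_mlr(X, y):
    """Fit y ~ b0 + X @ b by ordinary least squares; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted coefficients (intercept first) to new predictors."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


# Synthetic, purely illustrative data (NOT real load or weather observations).
rng = np.random.default_rng(0)
n = 500
temp_c = rng.uniform(15.0, 38.0, n)     # hypothetical daily max temperature
wind_ms = rng.uniform(0.0, 12.0, n)     # hypothetical wind speed
cloud_frac = rng.uniform(0.0, 1.0, n)   # hypothetical cloud cover fraction
X = np.column_stack([temp_c, wind_ms, cloud_frac])

# Assumed "true" relationship, with temperature as the dominant driver.
true_coef = np.array([200.0, 12.0, -1.5, -20.0])
load = true_coef[0] + X @ true_coef[1:] + rng.normal(0.0, 5.0, n)

coef = fit_mlr(X, load)
pred = predict_mlr(coef, X)

# Fraction of variance explained on the training sample.
r2 = 1.0 - np.sum((load - pred) ** 2) / np.sum((load - load.mean()) ** 2)
print("fitted coefficients:", np.round(coef, 2))
print("R^2:", round(float(r2), 3))
```

With real data, the same fit would normally be scored on held-out days rather than on the training sample, and terms for time of day or long-term changes in usage would be added as further columns of `X`.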