Improving Subseasonal Forecasting in the Western U.S. with Machine Learning
- Jessica Hwang
- Paulo Orenstein
- Judah Cohen
- Karl Pfeiffer
- Lester Mackey
2019 Knowledge Discovery and Data Mining | Published by ACM
Sub-Seasonal Climate Forecast Rodeo challenge winner
Water managers in the western United States (U.S.) rely on long-term forecasts of temperature and precipitation to prepare for droughts and wet weather extremes. To improve the accuracy of these long-term forecasts, the U.S. Bureau of Reclamation and the National Oceanic and Atmospheric Administration (NOAA) launched the Subseasonal Climate Forecast Rodeo, a year-long real-time forecasting challenge in which participants aimed to skillfully predict temperature and precipitation in the western U.S. two to four weeks and four to six weeks in advance. Here we present and evaluate our machine learning approach to the Rodeo and release our SubseasonalRodeo dataset, collected to train and evaluate our forecasting system.
Our system is an ensemble of two nonlinear regression models. The first integrates the diverse collection of meteorological measurements and dynamic model forecasts in the SubseasonalRodeo dataset and prunes irrelevant predictors using a customized multitask feature selection procedure. The second uses only historical measurements of the target variable (temperature or precipitation) and introduces multitask nearest neighbor features into a weighted local linear regression. Each model alone is significantly more accurate than the debiased operational U.S. Climate Forecasting System (CFSv2), and our ensemble skill exceeds that of the top Rodeo competitor for each target variable and forecast horizon. Moreover, over 2011-2018, an ensemble of our regression models and debiased CFSv2 improves debiased CFSv2 skill by 40-50% for temperature and 129-169% for precipitation. We hope that both our dataset and our methods will help to advance the state of the art in subseasonal forecasting.
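As an illustration of how forecasts like these might be combined and scored, here is a minimal Python sketch. It assumes that skill is measured as the cosine similarity between predicted and observed anomaly vectors, and that the ensemble averages member forecasts after scaling each to unit length. Neither detail is stated in the abstract above, and the function names `skill` and `ensemble` and the toy data are hypothetical.

```python
import numpy as np


def skill(pred_anom, obs_anom):
    """Cosine similarity between predicted and observed anomaly vectors.

    Assumption: this mirrors a common definition of forecast skill;
    the abstract itself does not define the metric.
    """
    pred = np.asarray(pred_anom, dtype=float)
    obs = np.asarray(obs_anom, dtype=float)
    denom = np.linalg.norm(pred) * np.linalg.norm(obs)
    # Return 0.0 when either vector is all zeros to avoid dividing by zero.
    return float(pred @ obs / denom) if denom > 0 else 0.0


def ensemble(*anomaly_forecasts):
    """Average forecasts after scaling each one to unit L2 norm.

    Assumption: equal weighting of normalized members is one simple way
    to combine models; it is not necessarily the authors' exact rule.
    """
    unit = []
    for forecast in anomaly_forecasts:
        f = np.asarray(forecast, dtype=float)
        norm = np.linalg.norm(f)
        unit.append(f / norm if norm > 0 else f)
    return np.mean(unit, axis=0)


# Toy data: two noisy "models" forecasting the same synthetic anomalies.
rng = np.random.default_rng(0)
observed = rng.normal(size=500)
model_a = observed + rng.normal(scale=2.0, size=500)
model_b = observed + rng.normal(scale=2.0, size=500)
combined = ensemble(model_a, model_b)

print("model A skill: ", round(skill(model_a, observed), 3))
print("model B skill: ", round(skill(model_b, observed), 3))
print("ensemble skill:", round(skill(combined, observed), 3))
```

Normalizing before averaging keeps a member with large-magnitude predictions from dominating the combination, which matters when the score depends only on the direction of the anomaly vector and not on its length.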
Publication Downloads
The SubseasonalRodeo Dataset
September 20, 2018
A benchmark dataset for training and evaluating subseasonal forecasting systems—systems predicting temperature or precipitation 2-6 weeks in advance—in the western contiguous United States.