Predicting the weather two to six weeks in advance, i.e., subseasonal forecasting, is critical to several sectors of society and poses formidable challenges to the machine learning community. At this forecast horizon, physics-based dynamical models have limited skill, and the complex dynamics between local weather and global climate variables make it hard to identify strong signals in the data. Furthermore, even though machine learning methods show great promise in advancing current state-of-the-art models, obtaining data for this weather regime is often challenging: the data come from different sources, at different temporal and spatial resolutions, and expert knowledge is required to distill the features that drive subseasonal phenomena. To help researchers develop the next generation of subseasonal models, AER Principal Scientist Judah Cohen, along with university and Microsoft colleagues, has created SubseasonalClimateUSA, a curated dataset tailored to benchmarking subseasonal forecasts in the U.S. The authors also benchmark several models on the dataset, from deep learning methods to operational dynamical models, as well as recent bias-correction algorithms. Overall, these benchmarks suggest simple and effective ways of extending the accuracy of current operational models. The dataset is regularly updated and available as a Python package at https://github.com/microsoft/subseasonal_data/.
Figure. Example of SubseasonalClimateUSA observations and dynamical model forecasts.
Citation: SubseasonalClimateUSA: A Dataset for Subseasonal Forecasting and Benchmarking
S. Mouatadid, P. Orenstein, G. Flaspohler, M. Oprescu, J. Cohen, F. Wang, S. Knight, M. Geogdzhayeva, S. Levang, E. Fraenkel, L. Mackey
NeurIPS 2023 Datasets and Benchmarks
arXiv:2109.10399