New Publication - Application of machine learning to improve reanalysis soil temperatures over the extratropical northern hemisphere

Herrington, T.C., Erler, A.R. & Fletcher, C.G. Application of machine learning to improve reanalysis soil temperatures over the extratropical northern hemisphere. Theor Appl Climatol 157, 419 (2026). https://doi.org/10.1007/s00704-026-06338-0

We are pleased to share a new publication by Herrington, Erler, and Fletcher, titled "Application of Machine Learning to Improve Reanalysis Soil Temperatures over the Extratropical Northern Hemisphere." The paper appears in the journal Theoretical and Applied Climatology.

The study evaluates soil temperatures in two widely used high-resolution land-surface products, ERA5-Land and FLDAS, across the extratropical Northern Hemisphere and demonstrates how machine learning techniques can substantially reduce biases in soil temperature estimates. Using a random forest approach trained on observations from more than 2,600 soil temperature stations, the researchers developed a bias-corrected soil temperature dataset covering the extratropical Northern Hemisphere from 1982 to 2023.

The resulting product substantially improves the accuracy of soil temperature estimates and provides a valuable new resource for hydrological modelling, climate research, and environmental monitoring. More accurate soil temperature information can help improve our understanding of land-surface processes, support climate impact assessments, and enhance the initialization and validation of hydrological models.

This work also supports the broader goals of the Canada1Water (C1W) project. As Canada’s first fully integrated 3D model of the national water cycle, C1W relies on high-quality climate and environmental datasets to improve our understanding of groundwater–surface water interactions and the impacts of climate change on water resources.

Improved soil temperature information is particularly important in northern regions, where rising temperatures and permafrost degradation are driving significant changes to groundwater flow, surface water dynamics, and the broader hydrological cycle.

As climate change continues to reshape Canada's northern landscapes, advances in soil temperature datasets help strengthen the scientific foundation for large-scale hydrological modelling efforts like C1W, providing researchers and decision-makers with better information to assess future water resource challenges and support climate adaptation planning.

Read the full publication via the DOI link in the citation below to learn more about the methodology, findings, and applications of this research.

Abstract

Reanalysis products provide spatially homogeneous coverage for a variety of climate variables. However, previous research has shown that soil temperature estimates in many reanalysis products have substantial biases, particularly in winter. Here we evaluate the performance of two products, ERA5-Land and FLDAS, across the northern hemisphere and test a hierarchy of statistical techniques for bias correction. Both products provide high-resolution (~9 km) estimates of vertically resolved soil temperature; however, ERA5-Land exhibits warm biases over permafrost regions and a median RMSE of between 1.6 K and 2.3 K, while FLDAS exhibits cold biases and a median RMSE of between 2.5 K and 4.5 K. Here we use multiple linear regression (MLR) and random forest regression (RF) for bias correction and compare them to mean bias subtraction (MBS) as a reference. The MLR and RF models employ 10 predictors, including soil depth, product soil temperature, air temperature, vegetation, snow cover, elevation, latitude and longitude. The RF model substantially outperforms MBS and MLR over all regions and latitudes, providing an average RMSE reduction (relative to the products’ soil temperatures) of 46% – 77% when the ground is snow-covered, and 56% – 64% during snow-free conditions. We introduce a bias-corrected soil temperature product, which provides gridded soil temperature data over the extratropical northern hemisphere between 1982 and 2023. This new data resource will be useful for a wide range of applications, including as an initialization condition for hydrological models, and as a tool to validate simulated soil temperatures.
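
For readers unfamiliar with the correction hierarchy named in the abstract, the minimal Python sketch below contrasts mean bias subtraction (MBS) with multiple linear regression (MLR) and reports the RMSE reduction relative to the uncorrected product. It is an illustration only, not the authors' code: the data are synthetic, the bias magnitudes are invented, and the MLR uses just two assumed predictors (product soil temperature and a hypothetical binary snow-cover flag) rather than the paper's 10. The random forest step is omitted so that the sketch needs nothing beyond the Python standard library.

```python
import math
import random


def rmse(pred, obs):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(rows, target):
    """Ordinary least squares via the normal equations (intercept included)."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, target)) for i in range(k)]
    return solve(xtx, xty)


def predict_mlr(coef, rows):
    """Apply fitted coefficients (intercept first) to predictor rows."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]


def make_samples(n):
    """Synthetic stand-in data: the 'product' is warm-biased, more so under snow."""
    rows, obs = [], []
    for _ in range(n):
        truth = random.uniform(-25.0, 15.0)            # "station" soil temperature
        snow = 1.0 if random.random() < 0.5 else 0.0   # hypothetical snow flag
        product = truth + 1.0 + 3.0 * snow + random.gauss(0.0, 0.5)
        rows.append((product, snow))
        obs.append(truth)
    return rows, obs


def reduction(before, after):
    """Percentage RMSE reduction relative to the uncorrected product."""
    return 100.0 * (before - after) / before


random.seed(0)
train_x, train_y = make_samples(300)
test_x, test_y = make_samples(300)

# Uncorrected product temperatures on the held-out samples.
raw = [p for p, _ in test_x]

# MBS: subtract the single mean bias estimated from the training samples.
mean_bias = sum(p - y for (p, _), y in zip(train_x, train_y)) / len(train_y)
mbs = [p - mean_bias for p in raw]

# MLR: regress station temperature on product temperature and snow flag.
coef = fit_mlr(train_x, train_y)
mlr = predict_mlr(coef, test_x)

rmse_raw = rmse(raw, test_y)
rmse_mbs = rmse(mbs, test_y)
rmse_mlr = rmse(mlr, test_y)

print(f"RMSE raw: {rmse_raw:.2f}  MBS: {rmse_mbs:.2f}  MLR: {rmse_mlr:.2f}")
print(f"RMSE reduction, MBS: {reduction(rmse_raw, rmse_mbs):.0f}%")
print(f"RMSE reduction, MLR: {reduction(rmse_raw, rmse_mlr):.0f}%")
```

In this toy setup the bias depends on snow cover, so a single subtracted mean cannot remove it, while a regression that sees the snow flag can. This mirrors, in a much simplified form, why the abstract's predictor-based models go beyond the MBS reference.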

Citation

Herrington, T.C., Erler, A.R. & Fletcher, C.G. Application of machine learning to improve reanalysis soil temperatures over the extratropical northern hemisphere. Theor Appl Climatol 157, 419 (2026). https://doi.org/10.1007/s00704-026-06338-0
