Tian Xuemeng, de Bruin Sytze, Simoes Rolf, Isik Mustafa Serkan, Minarik Robert, Ho Yu-Feng, Şahin Murat, Herold Martin, Consoli Davide, Hengl Tomislav
OpenGeoHub, Doorwerth, Netherlands.
Laboratory of Geo-Information Science and Remote Sensing, Wageningen University and Research, Wageningen, Netherlands.
PeerJ. 2025 Jul 14;13:e19605. doi: 10.7717/peerj.19605. eCollection 2025.
This article describes a comprehensive framework for soil organic carbon density (SOCD, kg/m³) modeling and mapping, based on spatiotemporal random forest (RF) and quantile regression forests (QRF). A total of 45,616 SOCD observations and various Earth observation (EO) feature layers were used to produce 30 m SOCD maps for the EU at four-year intervals (2000-2022) and four soil depth intervals (0-20 cm, 20-50 cm, 50-100 cm, and 100-200 cm). Per-pixel 95% probability prediction intervals (PIs) and extrapolation risk probabilities are also provided. Model evaluation indicates good overall accuracy (R² = 0.63 and CCC = 0.76 for hold-out independent tests). Prediction accuracy varies by land cover, depth interval, and year of prediction, with the lowest accuracy for shrubland and for the deepest interval (100-200 cm). The PI validation confirmed effective uncertainty estimation, though with reduced accuracy for higher SOCD values. Shapley analysis identified soil depth as the most influential feature, followed by vegetation, long-term bioclimate, and topographic features. While pixel-level uncertainty is substantial, spatial aggregation reduces uncertainty by approximately 66%. Detecting SOCD changes remains challenging, but the results offer a baseline for future improvements. The maps, based primarily on topsoil data from cropland, grassland, and woodland, are best suited for applications related to these land covers and depths. We recommend that users interpret the maps in conjunction with local knowledge and consider the accompanying uncertainty and extrapolation risk layers. All data and code are available under an open license at https://doi.org/10.5281/zenodo.13754343 and https://github.com/AI4SoilHealth/SoilHealthDataCube/.
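For readers unfamiliar with the accuracy figures quoted above, the following is a minimal sketch of how R², Lin's concordance correlation coefficient (CCC), and the empirical coverage of a prediction interval are commonly computed from hold-out observations. It uses the standard textbook definitions and is not taken from the authors' repository; the function names (`r2_score`, `ccc`, `pi_coverage`) and the sample values are hypothetical, and any transformation the authors may have applied to SOCD before evaluation is not reproduced here.

```python
from statistics import fmean


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fmean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def ccc(obs, pred):
    """Lin's concordance correlation coefficient (population moments)."""
    mean_o, mean_p = fmean(obs), fmean(pred)
    var_o = fmean([(o - mean_o) ** 2 for o in obs])
    var_p = fmean([(p - mean_p) ** 2 for p in pred])
    cov = fmean([(o - mean_o) * (p - mean_p) for o, p in zip(obs, pred)])
    return 2.0 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


def pi_coverage(obs, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    inside = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return inside / len(obs)


# Made-up illustrative values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]   # constant bias of +1
pi_lower = [0.5, 1.5, 3.5, 3.0]
pi_upper = [1.5, 2.5, 4.5, 5.0]

print(round(r2_score(observed, predicted), 3))               # 0.2
print(round(ccc(observed, predicted), 3))                    # 0.714
print(round(pi_coverage(observed, pi_lower, pi_upper), 3))   # 0.75
```

In this toy case the predictions are perfectly correlated with the observations but shifted by a constant, so CCC (which penalizes bias as well as scatter) stays below 1. For a well-calibrated 95% PI, the coverage computed this way on independent data would be expected to lie close to 0.95.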