Ji Chaonan, Fincke Tonio, Benson Vitus, Camps-Valls Gustau, Fernández-Torres Miguel-Ángel, Gans Fabian, Kraemer Guido, Martinuzzi Francesco, Montero David, Mora Karin, Pellicer-Valero Oscar J, Robin Claire, Söchting Maximilian, Weynants Mélanie, Mahecha Miguel D
Remote Sensing Centre for Earth System Research (RSC4Earth), Leipzig University, Leipzig, 04103, Germany.
Institute for Earth System Science and Remote Sensing, Leipzig University, Leipzig, 04103, Germany.
Sci Data. 2025 Jan 25;12(1):149. doi: 10.1038/s41597-025-04447-5.
As climate extremes grow more frequent and intense, robust analytical tools are crucial for predicting their impacts on terrestrial ecosystems. Machine learning techniques show promise but require well-structured, high-quality, curated, analysis-ready datasets. Earth observation datasets comprehensively monitor ecosystem dynamics and responses to climatic extremes, yet their complexity can limit the effectiveness of machine learning models. Despite recent progress in applying deep learning to ecosystem monitoring, datasets specifically designed for analysing the impacts of compound heatwave and drought extremes are still needed. Here, we introduce the DeepExtremeCubes database, tailored to map around these extremes, with a focus on persistent natural vegetation. It comprises over 40,000 globally sampled small data cubes (i.e. minicubes), each with a spatial coverage of 2.5 by 2.5 km. Each minicube includes (i) Sentinel-2 L2A images, (ii) ERA5-Land variables and a generated extreme event cube covering 2016 to 2022, and (iii) ancillary land cover and topography maps. The paper aims to (1) streamline data access, structuring, and pre-processing and enhance scientific reproducibility, and (2) facilitate forecasting of biosphere dynamics in response to compound extremes.