Department of Geosciences and Geography, University of Helsinki, PO Box 64, FI-00014, Helsinki, Finland.
Environnements et Paléoenvironnements, Océaniques et Continentaux, UMR 5805, Université de Bordeaux, Pessac, France.
Sci Rep. 2019 Nov 1;9(1):15805. doi: 10.1038/s41598-019-52293-4.
We test several quantitative algorithms as palaeoclimate reconstruction tools for North American and European fossil pollen data, using both classical methods and newer machine-learning approaches based on regression tree ensembles and artificial neural networks. We focus on the reconstruction of secondary climate variables (here, January temperature and annual water balance), because their ecological influence is small relative to that of the primary variable (July temperature), which presents special challenges to palaeo-reconstructions. We test the pollen-climate models using a novel and comprehensive cross-validation approach, running a series of h-block cross-validations with h values of 100–1500 km. Our study illustrates major benefits of this variable h-block cross-validation scheme: the effect of spatial autocorrelation is minimized, while cross-validations with increasing h values can reveal instabilities in the calibration model and approximate the challenges faced in palaeo-reconstructions with poor modern analogues. We achieve well-performing calibration models for both primary and secondary climate variables, with boosted regression trees providing the most robust performance overall, while the palaeoclimate reconstructions from fossil datasets show major independent features for the primary and secondary variables. Our results suggest that with careful variable selection and consideration of ecological processes, robust reconstruction of both primary and secondary climate variables is possible.
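The h-block cross-validation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes each calibration sample has a latitude and longitude in degrees, and it uses an ordinary least-squares fit as a stand-in for the pollen-climate models named in the abstract. The function names (`haversine_km`, `h_block_mask`, `h_block_cv_rmse`) are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for the distance calculation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def h_block_mask(lat, lon, i, h_km):
    """Mask of samples usable for training when sample i is the test site.

    Every sample within h_km of the test site (including the site itself)
    is excluded, so spatially autocorrelated neighbours cannot inform the
    prediction.
    """
    return haversine_km(lat[i], lon[i], lat, lon) > h_km


def h_block_cv_rmse(X, y, lat, lon, h_km):
    """Leave-one-out RMSE with an exclusion radius of h_km around each test site.

    The model here is plain linear least squares, used only as a placeholder.
    """
    preds = np.full(len(y), np.nan)
    for i in range(len(y)):
        train = h_block_mask(lat, lon, i, h_km)
        if train.sum() < X.shape[1] + 1:
            continue  # too few samples left outside the block; skip this site
        A = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        preds[i] = np.concatenate([[1.0], X[i]]) @ coef
    ok = ~np.isnan(preds)
    return float(np.sqrt(np.mean((preds[ok] - y[ok]) ** 2)))
```

Calling `h_block_cv_rmse` in a loop over several radii (for example 100, 500, 1000 and 1500 km) gives an error curve as a function of h. A model whose error grows sharply with h relies heavily on nearby, spatially autocorrelated samples.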