Advancing predictive modeling in archaeology: An evaluation of regression and machine learning methods on the Grand Staircase-Escalante National Monument.

Affiliations

Department of Anthropology, University of Utah, Salt Lake City, Utah, United States of America.

Archaeological Center, University of Utah, Salt Lake City, Utah, United States of America.

Publication information

PLoS One. 2020 Oct 1;15(10):e0239424. doi: 10.1371/journal.pone.0239424. eCollection 2020.

Abstract

Predictive models are central to both archaeological research and cultural resource management. Yet archaeological applications of predictive models are often insufficient due to small training data sets, inadequate statistical techniques, and a lack of theoretical insight to explain the responses of past land use to predictor variables. Here we address these critiques and evaluate the predictive power of four statistical approaches widely used in ecological modeling (generalized linear models, generalized additive models, maximum entropy, and random forests) to predict the locations of Formative Period (2100-650 BP) archaeological sites in the Grand Staircase-Escalante National Monument. We assess each modeling approach using a threshold-independent measure, the area under the curve (AUC), and threshold-dependent measures, such as the true skill statistic. We find that the majority of the modeling approaches struggle with archaeological datasets due to the frequent lack of true-absence locations, which violates the assumptions of generalized linear models, generalized additive models, and random forests, as well as of the measures of their predictive power (AUC). Maximum entropy is the only method tested here that is capable of utilizing pseudo-absence points (inferred absence data based on known presence data) and controlling for a non-representative sampling of the landscape, thus making maximum entropy the best modeling approach for common archaeological data when the goal is prediction. Regression-based approaches may be more applicable when prediction is not the goal, given their grounding in well-established statistical theory. Random forests, while the most powerful approach, are not applicable to archaeological data except in the rare case where true-absence data exist. Our results have significant implications for the application of predictive models by archaeologists for research and conservation purposes and highlight the importance of understanding model assumptions.
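The two kinds of evaluation measure named in the abstract can be illustrated with a minimal sketch. The labels and scores below are made-up toy values, not data from the study, and the function names `auc` and `true_skill_statistic` are hypothetical helpers introduced only for illustration. The sketch assumes the 0 labels are true absences, which is exactly the condition the abstract notes is often missing in archaeological data.

```python
from itertools import product


def auc(labels, scores):
    """Threshold-independent measure: probability that a randomly chosen
    presence (label 1) scores higher than a randomly chosen absence
    (label 0), counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def true_skill_statistic(labels, scores, threshold):
    """Threshold-dependent measure: sensitivity + specificity - 1,
    where a score >= threshold counts as a predicted presence."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Toy example (hypothetical values): 1 = site present, 0 = assumed true absence.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

toy_auc = auc(toy_labels, toy_scores)
toy_tss = true_skill_statistic(toy_labels, toy_scores, threshold=0.5)
```

The AUC uses every presence-absence pair and needs no cutoff, whereas the true skill statistic changes with the chosen threshold. Both depend on the absence labels being trustworthy, so replacing them with pseudo-absences changes what either number means.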

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/74f9/7529236/652ad070d81a/pone.0239424.g001.jpg
