Recruiting a skeleton crew: Methods for simulating and augmenting paleoanthropological data using Monte Carlo based algorithms.

Author information

Department of Cartographic and Land Engineering, Higher Polytechnic School of Avila, University of Salamanca, Ávila, Spain.

Department of Geology, Facultad de Ciencia y Tecnología, Universidad del País Vasco-Euskal Herriko Unibertsitatea (UPV/EHU), Leioa, Spain.

Publication information

Am J Biol Anthropol. 2023 Jul;181(3):454-473. doi: 10.1002/ajpa.24754. Epub 2023 May 17.

Abstract

OBJECTIVES

Data collection is a major hindrance in many types of analyses in human evolutionary studies. This issue is fundamental when considering the scarcity and quality of fossil data. From this perspective, many research projects are impeded by the amount of data available to perform tasks such as classification and predictive modeling.

MATERIALS AND METHODS

Here we present the use of Monte Carlo based methods for the simulation of paleoanthropological data. Using two datasets containing cross-sectional biomechanical information and geometric morphometric 3D landmarks, we show how synthetic, yet realistic, data can be simulated to enhance each dataset and provide new information with which to perform complex tasks, in particular classification. We additionally present these algorithms in the form of an R library, AugmentationMC. We also use a geometric morphometric dataset to simulate 3D models, and emphasize the power of Machine Teaching, as opposed to Machine Learning.

RESULTS

Our results show how Monte Carlo based algorithms, such as the Markov Chain Monte Carlo, are useful for the simulation of morphometric data, providing synthetic yet highly realistic data that statistical testing shows to be equivalent to the original data. We additionally provide a critical overview of bootstrapping techniques, showing how Monte Carlo based methods perform better than bootstrapping because the simulated data are not exact copies of the original sample.
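
The contrast between bootstrapping and Monte Carlo simulation can be illustrated with a minimal Python sketch. It is not the AugmentationMC library and does not reproduce the authors' algorithms. It assumes a single continuous variable, a normal density fitted to the sample as the target distribution, and a random-walk Metropolis-Hastings sampler. The measurements and the function names `bootstrap` and `metropolis_hastings` are hypothetical.

```python
import math
import random
import statistics


def bootstrap(sample, n, rng):
    """Resample with replacement: every output is a copy of an input value."""
    return [rng.choice(sample) for _ in range(n)]


def metropolis_hastings(sample, n, rng, burn_in=500):
    """Draw n values from a normal density fitted to `sample`.

    Assumption for illustration only: the target is a normal distribution
    with the sample mean and standard deviation, explored with a
    random-walk Metropolis-Hastings chain.
    """
    mu = statistics.mean(sample)
    sigma = statistics.stdev(sample)

    def log_density(x):
        # Log of the normal density, up to an additive constant.
        return -0.5 * ((x - mu) / sigma) ** 2

    current = mu
    draws = []
    for i in range(burn_in + n):
        # Propose a move by adding Gaussian noise to the current state.
        proposal = current + rng.gauss(0.0, sigma)
        # Accept with probability min(1, p(proposal) / p(current)).
        # 1.0 - rng.random() lies in (0, 1], so the logarithm is defined.
        log_u = math.log(1.0 - rng.random())
        if log_u < log_density(proposal) - log_density(current):
            current = proposal
        # Keep the state only after the burn-in period has passed.
        if i >= burn_in:
            draws.append(current)
    return draws


rng = random.Random(0)

# Hypothetical measurements standing in for a small fossil sample.
sample = [41.2, 43.5, 39.8, 44.1, 42.7, 40.9, 45.3, 43.0]

boot = bootstrap(sample, 200, rng)
synth = metropolis_hastings(sample, 2000, rng)

print("distinct bootstrap values:", len(set(boot)))
print("distinct simulated values:", len(set(synth)))
print("sample mean:", statistics.mean(sample))
print("simulated mean:", statistics.mean(synth))
```

The bootstrap output can only contain the eight original values, whereas the chain produces new values spread around the fitted distribution. A real application would need a target distribution suited to the data, and a check that the simulated and original samples are statistically equivalent, as the abstract describes.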

DISCUSSION

While synthetic datasets should never replace large and real datasets, this can be considered an important advance in how paleoanthropological data can be handled.
