Navega David, Coelho Catarina, Vicente Ricardo, Ferreira Maria Teresa, Wasterlain Sofia, Cunha Eugénia
Forensic Sciences Centre (CENCIFOR), Largo da Sé Nova, 3000-213, Coimbra, Portugal
Int J Legal Med. 2015 Sep;129(5):1145-53. doi: 10.1007/s00414-014-1050-9. Epub 2014 Jul 23.
In forensic anthropology, ancestry estimation is essential to establishing an individual's biological profile. The aim of this study is to present a new program, AncesTrees, developed for assessing ancestry based on metric analysis. AncesTrees relies on a machine learning ensemble algorithm, random forest, to classify the human skull. In the ensemble learning paradigm, several models are generated and used jointly to arrive at the final decision. The random forest algorithm creates ensembles of decision tree classifiers, a non-linear and non-parametric classification technique. The database used in AncesTrees is composed of 23 craniometric variables from 1,734 individuals, representative of six major ancestral groups and selected from Howells' craniometric series. The program was tested on 128 adult crania from the following collections: the African slaves' skeletal collection of Valle da Gafaria, and the Medical School Skull Collection and the Identified Skeletal Collection of the 21st Century, both curated at the University of Coimbra. The first stage of the test analysis was ancestry estimation including all the ancestral groups of the database. The second stage was ancestry estimation including only the European and African ancestral groups. In the first stage, 75% of the individuals of African ancestry and 79.2% of the individuals of European ancestry were correctly identified. The model involving only the African and European ancestral groups performed better: 93.8% of all individuals were correctly classified. These results show that AncesTrees can be a valuable tool in forensic anthropology.
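To illustrate the ensemble idea described in the abstract, the following is a minimal sketch of a random forest: each tree is grown on a bootstrap resample of the data, each split considers only a random subset of the variables, and the final class is decided by majority vote across trees. This is not the AncesTrees implementation and does not use the Howells data. The measurements are synthetic, the group labels `group_a` and `group_b` are hypothetical, and all function names and parameter values (25 trees, depth limit of 5, four variables) are assumptions chosen for illustration.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def best_split(rows, labels, n_features, rng):
    """Search a random subset of variables for the lowest weighted Gini split."""
    best = None
    for f in rng.sample(range(len(rows[0])), n_features):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, n_features, rng, depth=0, max_depth=5):
    """Grow one tree; leaves are labels, internal nodes are (feature, threshold, left, right)."""
    if depth >= max_depth or len(set(labels)) == 1:
        return majority(labels)
    split = best_split(rows, labels, n_features, rng)
    if split is None:
        return majority(labels)
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f, t,
        build_tree([rows[i] for i in left], [labels[i] for i in left],
                   n_features, rng, depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right],
                   n_features, rng, depth + 1, max_depth),
    )

def predict_tree(node, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on its own bootstrap resample of the rows."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))  # common sqrt(p) heuristic
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest

def predict_forest(forest, row):
    """Classify one row by majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])

# Synthetic data only: two hypothetical groups, four made-up measurements (mm).
data_rng = random.Random(42)
rows, labels = [], []
for name, mean in (("group_a", 100.0), ("group_b", 120.0)):
    for _ in range(40):
        rows.append([data_rng.gauss(mean, 2.0) for _ in range(4)])
        labels.append(name)

forest = fit_forest(rows, labels)
print(predict_forest(forest, [100.0, 100.0, 100.0, 100.0]))
print(predict_forest(forest, [120.0, 120.0, 120.0, 120.0]))
```

Because every tree sees a different resample and a different subset of variables, the individual trees disagree on borderline cases, and the vote across them is what makes the ensemble more stable than any single tree.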