A single nucleotide polymorphism panel for individual identification and ancestry assignment in Caucasians and four East and Southeast Asian populations using a machine learning classifier.

Authors

Hwa Hsiao-Lin, Wu Ming-Yih, Lin Chih-Peng, Hsieh Wei Hsin, Yin Hsiang-I, Lee Tsui-Ting, Lee James Chun-I

Affiliations

Department and Graduate Institute of Forensic Medicine, College of Medicine, National Taiwan University, No. 1, Sec. 1, Jen Ai Rd, Taipei, 100, Taiwan.

Department of Obstetrics and Gynecology, National Taiwan University Hospital, No. 7 Chung Shan S. Rd, Taipei, 100, Taiwan.

Publication information

Forensic Sci Med Pathol. 2019 Mar;15(1):67-74. doi: 10.1007/s12024-018-0071-y. Epub 2019 Jan 16.

Abstract

Single nucleotide polymorphism (SNP) profiling is an effective means of individual identification and ancestry inference in forensic genetics. This study established a SNP panel for simultaneous individual identification and ancestry assignment in Caucasians and four East and Southeast Asian populations. We analyzed 220 SNPs (125 autosomal, 17 X-chromosomal, 30 Y-chromosomal, and 48 mitochondrial SNPs) in DNA samples from 563 unrelated individuals from five populations (89 Caucasian, 234 Taiwanese Han, 90 Filipino, 79 Indonesian, and 71 Vietnamese) and in 18 degraded DNA samples. Informativeness for assignment (In) was used to select ancestry informative SNPs (AISNPs). A machine learning classifier, the support vector machine (SVM), was used for ancestry assignment. Of the 220 SNPs, 62 were individual identification SNPs (IISNPs) (51 autosomal and 11 X-chromosomal SNPs) and 191 were AISNPs (100 autosomal, 13 X-chromosomal, 30 Y-chromosomal, and 48 mitochondrial SNPs). The 51 autosomal IISNPs gave cumulative random match probabilities (cRMPs) ranging from 1.56 × 10 to 3.16 × 10 among these five populations. Using the AISNPs with the SVM, the overall accuracy of ancestry inference in the testing dataset was 88.9% among the Caucasian, Taiwanese Han, and Filipino populations, whereas it was 70.0% between Caucasians and each of the four East and Southeast Asian populations. For the 18 degraded DNA samples with incomplete profiles, the accuracy of ancestry assignment was 94.4%. We have developed a 220-SNP panel for simultaneous individual identification and ethnic origin differentiation between Caucasian and the four East and Southeast Asian populations. This SNP panel may assist with DNA analysis in forensic casework.
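
The abstract names two quantities without defining them: the cumulative random match probability (cRMP) and the informativeness for assignment (In). The sketch below illustrates how such quantities are commonly computed for biallelic SNPs. It is not taken from the paper. It assumes Hardy-Weinberg proportions and independent loci for the cRMP, and it uses the In statistic in the form usually attributed to Rosenberg and colleagues (natural logarithm, equal weight per population). All function names and allele frequencies are hypothetical.

```python
import math


def snp_match_probability(p):
    """Random match probability for one biallelic SNP.

    Assumes Hardy-Weinberg proportions: two random individuals match
    when they share a genotype, so the probability is the sum of the
    squared genotype frequencies.
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def cumulative_rmp(allele_freqs):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return math.prod(snp_match_probability(p) for p in allele_freqs)


def informativeness(pop_freqs):
    """In for one biallelic SNP across K populations.

    pop_freqs holds the frequency of one allele in each population;
    the other allele has frequency 1 - p. Terms with zero frequency
    contribute nothing (0 * log 0 is taken as 0).
    """
    k = len(pop_freqs)
    total = 0.0
    for allele in (0, 1):
        freqs = [p if allele == 0 else 1.0 - p for p in pop_freqs]
        mean = sum(freqs) / k
        if mean > 0.0:
            total -= mean * math.log(mean)
        total += sum(f * math.log(f) for f in freqs if f > 0.0) / k
    return total


# Hypothetical allele frequencies, for illustration only.
identification_panel = [0.5, 0.45, 0.55]
print(cumulative_rmp(identification_panel))

# One SNP's allele frequency in five hypothetical populations.
print(informativeness([0.9, 0.2, 0.3, 0.25, 0.2]))
```

Under these assumptions a SNP with allele frequency 0.5 has a per-locus match probability of 0.375, the lowest possible for a biallelic marker, which is why balanced SNPs are favored for identification. In is 0 when allele frequencies are identical across populations and grows as they diverge, which is why high-In SNPs are favored for ancestry assignment.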
