
AIM-SNPtag: A computationally efficient approach for developing ancestry-informative SNP panels.

Affiliations

CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China.

CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China.

Publication information

Forensic Sci Int Genet. 2019 Jan;38:245-253. doi: 10.1016/j.fsigen.2018.10.015. Epub 2018 Nov 2.

Abstract

Inferring an individual's ancestry or group membership from a small set of highly informative genetic markers is very useful in forensic and medical genetics. However, given the huge amount of SNP data available from a diverse set of populations, it is challenging to develop informative panels by exhaustively searching all possible SNP combinations. In this study, we formulate panel development as an algorithmic problem: selecting an optimal set of SNPs that maximizes inference accuracy while minimizing set size. Building on this formulation, we develop a computational approach that efficiently constructs ancestry-informative panels from multi-population genome-wide SNP data. We evaluated the method by comparing the panel size and membership inference accuracy of the constructed SNP panels with those of panels selected through empirical procedures in previous studies. For membership inference among population groups including Asian, European, African, East Asian and Southeast Asian, a 36-SNP panel developed by our approach has an overall accuracy of 99.07%, and a 21-SNP subset of that panel has an overall accuracy of 95.36%. In comparison, an existing panel requires 74 SNPs to achieve an accuracy of 94.14% on the same set of population groups. We further applied the method to four subpopulations within Europe (Finnish, British, Spanish and Italian): a 175-SNP panel discriminates individuals of those European subpopulations with an accuracy of 99.36%, and a 68-SNP subset of it achieves an accuracy of 95.07%. We expect our method to be a useful tool for constructing ancestry-informative markers in forensic genetics.
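The abstract does not describe the selection algorithm itself, so the following is only a minimal illustrative sketch of the general problem it states (pick few SNPs, keep accuracy high). It uses a plain greedy forward selection with a nearest-centroid classifier, which is an assumption made for illustration and not the AIM-SNPtag procedure. The function names `centroid_accuracy` and `greedy_panel`, the `target` and `max_size` parameters, and the toy genotype matrix are all hypothetical.

```python
from collections import defaultdict


def centroid_accuracy(genotypes, labels, snps):
    """Fraction of individuals assigned to their own population by a
    nearest-centroid rule restricted to the SNP indices in `snps`.
    Accuracy is measured on the same data used to build the centroids."""
    sums = defaultdict(lambda: [0.0] * len(snps))
    counts = defaultdict(int)
    for row, pop in zip(genotypes, labels):
        counts[pop] += 1
        for k, j in enumerate(snps):
            sums[pop][k] += row[j]
    centroids = {p: [s / counts[p] for s in sums[p]] for p in counts}

    correct = 0
    for row, pop in zip(genotypes, labels):
        def dist(p):
            return sum((row[j] - c) ** 2 for j, c in zip(snps, centroids[p]))
        # Ties are broken by population name so the result is deterministic.
        predicted = min(sorted(centroids), key=dist)
        correct += predicted == pop
    return correct / len(labels)


def greedy_panel(genotypes, labels, target=0.95, max_size=10):
    """Hypothetical greedy forward selection: repeatedly add the SNP that
    most improves accuracy, stopping once `target` is reached, the panel
    holds `max_size` SNPs, or no remaining SNP improves accuracy."""
    n_snps = len(genotypes[0])
    panel, best = [], 0.0
    while len(panel) < max_size and best < target:
        candidates = [j for j in range(n_snps) if j not in panel]
        if not candidates:
            break
        # Prefer higher accuracy, then the lower SNP index on ties.
        acc, neg_j = max(
            (centroid_accuracy(genotypes, labels, panel + [j]), -j)
            for j in candidates
        )
        if acc <= best:
            break
        panel.append(-neg_j)
        best = acc
    return panel, best


# Toy data (made up): genotypes coded as 0/1/2 copies of an allele,
# six individuals from three populations, four SNPs.
genotypes = [
    [0, 0, 1, 2], [0, 0, 1, 0],   # population A
    [2, 0, 1, 2], [2, 0, 1, 0],   # population B
    [2, 2, 1, 2], [2, 2, 1, 0],   # population C
]
labels = ["A", "A", "B", "B", "C", "C"]

panel, acc = greedy_panel(genotypes, labels)
print(panel, acc)
```

In this toy case the first two SNPs together separate all three populations, while the last two carry no population signal and are never selected. A realistic evaluation would estimate accuracy on held-out individuals (for example by cross-validation) rather than on the data used for selection, and a greedy search gives no guarantee of finding the smallest possible panel.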

