Manipal Centre for Natural Sciences (MCNS), Manipal Academy of Higher Education, Manipal, Karnataka, India.
Department of Medical Genetics, Kasturba Medical College, Manipal Academy of Higher Education, Manipal, Karnataka, India.
Genome Biol Evol. 2018 Sep 1;10(9):2408-2416. doi: 10.1093/gbe/evy182.
The inference of genomic ancestry using ancestry informative markers (AIMs) can be useful for a range of studies in evolutionary genetics, biomedical research, and forensic analyses. However, the determination of AIMs for highly admixed populations with complex ancestries has remained a formidable challenge. Given the immense genetic heterogeneity and unique population structure of the Indian subcontinent, we sought to derive AIMs that would yield a cohesive and faithful understanding of South Asian genetic origins. To identify the optimal strategy for extracting AIMs for South Asians, we compared three commonly used AIMs-determining methods, namely Infocalc, FST, and Smart Principal Component Analysis with ADMIXTURE, using previously published whole-genome data from the Indian subcontinent. Our findings suggest that the Infocalc approach is likely the most suitable for delineating South Asian AIMs. In particular, Infocalc-2,000 (N = 2,000 markers) emerged as the most informative South Asian AIMs panel: it recapitulated the finer structure within South Asian genomes with a high degree of sensitivity and precision, whereas a negative control with an equivalent number of randomly selected markers failed to do so when used to interrogate the same South Asian populations. We discuss the utility of all approaches under evaluation for deriving AIMs and interpreting South Asian genomic ancestries. Notably, this is the first report of an AIMs panel for South Asian ancestry inference. Overall, these findings may aid in developing cost-effective resources for large-scale demographic analyses and foster expansion of our knowledge of human origins and disease in the South Asian context.
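To illustrate the kind of marker ranking that an Infocalc-style panel relies on, the following is a minimal sketch, not the authors' pipeline. It assumes biallelic SNPs with known per-population allele frequencies and scores each marker with Rosenberg's informativeness for assignment (I_n), the statistic that the Infocalc program reports; the abstract itself does not spell out this formula. The function names `informativeness` and `top_markers` and the toy frequencies are hypothetical.

```python
import numpy as np

def informativeness(freqs):
    """Informativeness for assignment (I_n) per biallelic marker.

    freqs: array of shape (n_markers, K) holding the frequency of one
    allele in each of K reference populations (illustrative input).
    """
    p = np.asarray(freqs, dtype=float)
    # Both alleles of each SNP: shape (n_markers, K, 2)
    alleles = np.stack([p, 1.0 - p], axis=-1)
    # Mean allele frequency across populations: shape (n_markers, 2)
    mean = alleles.mean(axis=1)

    def xlogx(x):
        # x * ln(x), with the convention 0 * ln(0) = 0
        safe = np.where(x > 0, x, 1.0)
        return np.where(x > 0, x * np.log(safe), 0.0)

    # I_n = sum over alleles of [ -p_j ln p_j + (1/K) sum_i p_ij ln p_ij ]
    return (-xlogx(mean) + xlogx(alleles).mean(axis=1)).sum(axis=-1)

def top_markers(freqs, n):
    """Indices of the n highest-scoring markers, best first."""
    scores = informativeness(freqs)
    return np.argsort(scores)[::-1][:n]

# Toy data (made up): three SNPs, two populations
toy = [[0.5, 0.5],   # identical frequencies: uninformative
       [1.0, 0.0],   # fixed difference: maximally informative
       [0.9, 0.1]]   # strong but incomplete differentiation
panel = top_markers(toy, 2)
```

Under these assumptions, a panel such as Infocalc-2,000 would correspond to calling `top_markers(freqs, 2000)` on genome-wide frequencies, while the negative control would draw the same number of marker indices at random.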