Department of Genetics, Yale University School of Medicine, New Haven, CT, 06520, USA.
Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, 06520, USA.
Sci Rep. 2019 Dec 11;9(1):18874. doi: 10.1038/s41598-019-55175-x.
The benefits of ancestry informative SNP (AISNP) panels can accrue, and be properly evaluated, only as sufficient reference population data become readily accessible. Ideally, the set of reference populations should approximate the genetic diversity of human populations worldwide. The Kidd and Seldin AISNP sets are the two panels that have each accumulated, to date, the largest and most diverse collections of data on human reference populations from the major continental regions. A recent tally in the ALFRED allele frequency database finds 164 reference populations with data for all 55 Kidd AISNPs and 132 reference populations with data for all 128 Seldin AISNPs. Although much more of the genetic diversity in human populations around the world still needs to be documented, 81 populations have genotype data available for all 170 AISNPs in the union of the Kidd and Seldin panels. In this report, we apply admixture and principal component analyses to these 81 worldwide populations, and to some regional subsets of them, to determine how well the combined panel illuminates population relationships. Analyses of this dataset that focused on Native American populations revealed very strong cluster patterns associated with many of the individual populations studied.
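As a rough illustration of the principal component analysis mentioned above, the following minimal sketch runs PCA on a genotype matrix coded as 0/1/2 allele counts. It is not the authors' pipeline: the function name `genotype_pca`, the two simulated populations, their allele frequencies, and the sample sizes are all assumptions made for illustration, and the data are synthetic rather than drawn from ALFRED. Only the SNP count of 170 is taken from the abstract.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """PCA on an (n_individuals, n_snps) matrix of 0/1/2 allele counts.

    Hypothetical helper for illustration only. Returns per-individual
    scores on the leading components and the fraction of total variance
    each of those components explains.
    """
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)  # center each SNP column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s[:n_components] ** 2) / (s ** 2).sum()
    return scores, explained


# Synthetic data (an assumption): two populations with independently
# drawn allele frequencies at 170 SNPs, 30 individuals each.
rng = np.random.default_rng(0)
n_snps = 170
freqs_a = rng.uniform(0.05, 0.95, n_snps)
freqs_b = rng.uniform(0.05, 0.95, n_snps)
pop_a = rng.binomial(2, freqs_a, size=(30, n_snps))
pop_b = rng.binomial(2, freqs_b, size=(30, n_snps))

genotypes = np.vstack([pop_a, pop_b])
scores, explained = genotype_pca(genotypes)
```

In this toy setting the first component is expected to separate the two simulated populations, which is the kind of cluster pattern a PCA of AISNP genotypes is meant to reveal. Real analyses would work from individual genotypes in the reference populations and typically pair PCA with a model-based admixture method.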