Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA.
Institute of Forensic Science, Istanbul University-Cerrahpasa, Istanbul 34500, Turkey.
Forensic Sci Int Genet. 2021 Jul;53:102528. doi: 10.1016/j.fsigen.2021.102528. Epub 2021 May 14.
The Southwest Asian, circum-Mediterranean, and Southern European populations (collectively, SWAMSE) together with Northern European populations form one of five "continental" groups of global populations in many analyses of population relationships. This region is of great anthropologic and forensic interest, but autosomal genetic markers have not cleanly resolved the relationships among the many populations within it. To examine the genetic boundaries of the SWAMSE region and whether internal structure can be detected, we assembled data for 151 separate autosomal genetic markers on a global set of 95 populations from this region and other parts of the world. The markers include 83 ancestry-informative SNPs as singletons and 68 microhaplotype loci defined by 204 SNPs. The 151 loci are ancestry informative on a global scale, identifying at least five biogeographic clusters. One of those clusters is a clear grouping of 37 populations containing the SWAMSE plus northern European populations, to the exclusion of populations in South Central Asia and populations from farther east. A refined analysis of the 37 populations shows the northern European populations clustering separately from the SWAMSE populations. Within Southwest Asia the Samaritans and Shabaks are distinct outliers. The Yemenite Jews, Saudi, Kuwaiti, Palestinian Arabs, and Southern Tunisians cluster together loosely, while the remaining populations from Northern Iraq, Mediterranean Europe, the Caucasus region, and Iran cluster in a more complex, graded fashion. The majority of the SWAMSE populations from the mainland of Southwest Asia form a cluster with little internal structure, reflecting a very complex history of endogamy and migrations. The set of 151 DNA polymorphisms not only distinguishes major geographical regions globally but can also distinguish ancestry to a small degree within geographical regions such as SWAMSE.
We discuss forensic characteristics of the polymorphisms and also identify those that rank highest by Rosenberg's I_n measure for the SWAMSE region populations and for the global set of populations analyzed. DATA AVAILABILITY: Genotypes for all 151 markers on all 3790 individuals from the 72 populations typed in the Kidd Lab have been deposited in the Zenodo archive and can be freely accessed at https://doi.org/10.5281/zenodo.4658892. Some of the data have been made public previously as supplemental files appended to publications. Data for the additional individuals included in the analyses were taken from already public datasets, as indicated in the text.
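The ranking of loci by Rosenberg's informativeness for assignment mentioned above can be illustrated with a minimal sketch. The abstract does not give the formula or the authors' software; the code assumes the standard definition from Rosenberg et al. (2003), I_n = sum over alleles j of [ -p_j ln p_j + (1/K) sum over populations i of p_ij ln p_ij ], where p_j is the mean frequency of allele j across the K populations. The function names, locus names, and allele frequencies are hypothetical and serve only as illustration.

```python
import math


def informativeness_for_assignment(freqs):
    """Rosenberg's I_n for one locus.

    freqs: list of K per-population allele-frequency lists, all the same
    length, each summing to 1. For a microhaplotype, the "alleles" are the
    haplotypes at that locus.
    """
    k = len(freqs)
    n_alleles = len(freqs[0])
    total = 0.0
    for j in range(n_alleles):
        # Mean frequency of allele j across populations.
        p_bar = sum(pop[j] for pop in freqs) / k
        if p_bar > 0:
            total -= p_bar * math.log(p_bar)
        # Average within-population term; 0 * ln(0) is treated as 0.
        total += sum(pop[j] * math.log(pop[j]) for pop in freqs if pop[j] > 0) / k
    return total


def rank_loci(loci):
    """Return (locus, I_n) pairs sorted from most to least informative."""
    scored = [(name, informativeness_for_assignment(f)) for name, f in loci.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical frequencies for three loci in two populations.
example_loci = {
    "locus_A": [[0.9, 0.1], [0.2, 0.8]],            # strongly differentiated SNP
    "locus_B": [[0.5, 0.5], [0.5, 0.5]],            # identical frequencies
    "locus_C": [[0.6, 0.3, 0.1], [0.3, 0.3, 0.4]],  # three-allele microhaplotype
}

ranking = rank_loci(example_loci)
```

I_n is zero when all populations share the same allele frequencies and reaches ln K when each of the K populations is fixed for a different allele, so loci with larger values contribute more to distinguishing the populations compared.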