Ringbauer Harald, Huang Yilei, Akbari Ali, Mallick Swapan, Patterson Nick, Reich David
Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.
Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA.
bioRxiv. 2023 Mar 9:2023.03.08.531671. doi: 10.1101/2023.03.08.531671.
Long DNA sequences shared between two individuals, known as identical-by-descent (IBD) segments, are a powerful signal for identifying close and distant biological relatives because they only arise when the pair shares a recent common ancestor. Existing methods to call IBD segments between present-day genomes cannot be straightforwardly applied to ancient DNA (aDNA) data due to typically low coverage and high genotyping error rates. We present ancIBD, a method to identify IBD segments for human aDNA data implemented as a Python package. Our approach is based on a Hidden Markov Model, using as input genotype probabilities imputed based on a modern reference panel of genomic variation. Through simulation and downsampling experiments, we demonstrate that ancIBD robustly identifies IBD segments longer than 8 centimorgans (cM) for aDNA data with either at least 0.25x average whole-genome sequencing (WGS) coverage depth or at least 1x average depth for in-solution enrichment experiments targeting a widely used aDNA SNP set ('1240k'). This application range allows us to screen a substantial fraction of the aDNA record for IBD segments, and we showcase two downstream applications. First, leveraging the fact that biological relatives up to the sixth degree are expected to share multiple long IBD segments, we identify relatives among 10,156 ancient Eurasian individuals and document evidence of long-distance migration, for example by identifying a pair of approximately fifth-degree relatives who were buried 1410 km apart in Central Asia 5000 years ago. Second, by applying ancIBD, we reveal new details regarding the spread of ancestry related to Steppe pastoralists into Europe starting 5000 years ago.
We find that the first individuals in Central and Northern Europe carrying high amounts of Steppe ancestry, associated with the Corded Ware culture, share high rates of long IBD segments (12-25 cM) with Yamnaya herders of the Pontic-Caspian steppe. This signals a strong bottleneck and a recent biological connection on the order of only a few hundred years, providing evidence that the Yamnaya themselves are a main source of Steppe ancestry in Corded Ware people. We also detect elevated sharing of long IBD segments between Corded Ware individuals and people associated with the Globular Amphora culture (GAC) from Poland and Ukraine, who were Copper Age farmers not yet carrying Steppe-like ancestry. These IBD links appear for all Corded Ware groups in our analysis, indicating that individuals related to GAC contexts must have had a major demographic impact early on in the genetic admixtures giving rise to various Corded Ware groups across Europe. These results show that detecting IBD segments in aDNA can generate new insights both on a small scale, relevant to understanding the life stories of people, and on the macroscale, relevant to large-scale cultural-historical events.
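As an illustration of how long IBD segments can be screened per pair of individuals, the following minimal Python sketch keeps only segments longer than 8 cM and reports, for each pair, the number of such segments and their summed length. The record layout, the individual labels, the segment coordinates, and the function name `summarize_long_ibd` are all hypothetical and chosen for illustration; this is not the ancIBD API or its output format.

```python
from collections import defaultdict

# Hypothetical segment records: (individual_a, individual_b, chromosome, start_cM, end_cM).
# All names and values are made up for illustration; they are not ancIBD output.
segments = [
    ("ind1", "ind2", 1, 10.0, 32.5),
    ("ind1", "ind2", 4, 50.0, 56.0),   # 6 cM: below the threshold, ignored
    ("ind1", "ind2", 7, 5.0, 19.0),
    ("ind1", "ind3", 2, 80.0, 89.5),
]


def summarize_long_ibd(segments, min_cm=8.0):
    """Return {pair: (segment count, summed length in cM)} for segments longer than min_cm."""
    totals = defaultdict(lambda: [0, 0.0])
    for a, b, _chrom, start, end in segments:
        length = end - start
        if length > min_cm:
            pair = tuple(sorted((a, b)))  # treat (a, b) and (b, a) as the same pair
            totals[pair][0] += 1
            totals[pair][1] += length
    return {pair: (count, total) for pair, (count, total) in totals.items()}


summary = summarize_long_ibd(segments)
for pair, (count, total) in sorted(summary.items()):
    print(pair, count, total)
```

Pairs with several long segments and a large summed length are candidates for closer biological relatedness, whereas a single segment just above the threshold points to a more distant connection. Any cutoffs for assigning a degree of relatedness would need to come from the method's own calibration, not from this sketch.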