Stolyarova Anastasia, Coop Graham, Przeworski Molly
Department of Biological Sciences, Columbia University, New York, NY 10027.
Department of Evolution and Ecology and Center for Population Biology, University of California, Davis, CA 95616.
Proc Natl Acad Sci U S A. 2025 May 27;122(21):e2503857122. doi: 10.1073/pnas.2503857122. Epub 2025 May 23.
A major focus of human genetics is to map severe disease mutations. Increasingly, that goal is understood as requiring huge numbers of people to be sequenced from every broadly defined genetic ancestry group, so as not to miss "ancestry-specific variants." Here, we consider whether this focus is warranted. We start from first-principles considerations, based on models of mutation-drift-selection balance, which suggest that since severe disease mutations tend to be strongly deleterious, and thus evolutionarily young, they will be kept at relatively constant frequency through recurrent mutation. Therefore, highly pathogenic alleles should be shared identically by descent within extended families, not broad ancestry groups, and sequencing more people should yield similar numbers of such alleles regardless of ancestry. We test the model predictions using gnomAD genetic ancestry groupings and show that they provide a good fit to the classes of variants most likely to be highly pathogenic, notably sets of loss-of-function alleles at strongly constrained genes. These findings clarify that strongly deleterious alleles will be found at comparable rates in people of all ancestries, and the information they provide about human biology is shared across ancestries.
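The reasoning about recurrent mutation can be illustrated with the classical deterministic mutation-selection balance approximation, q ≈ μ/(hs), where μ is the mutation rate to the deleterious class and hs is the selection coefficient against heterozygotes. The sketch below is not the authors' model (which also accounts for drift); it is a textbook approximation under assumed, hypothetical parameter values, and the function names are illustrative. It shows only that, under strong selection, the expected frequency depends on μ and hs rather than on any ancestry label.

```python
def equilibrium_frequency(mu: float, hs: float) -> float:
    """Approximate equilibrium frequency of a strongly deleterious allele.

    Classical deterministic mutation-selection balance: q ~= mu / hs.
    mu: mutation rate to the deleterious class per generation (assumed value).
    hs: selection coefficient against heterozygotes, in (0, 1].
    """
    if mu < 0:
        raise ValueError("mu must be non-negative")
    if not 0 < hs <= 1:
        raise ValueError("hs must be in (0, 1]")
    return mu / hs


def expected_allele_copies(mu: float, hs: float, n_individuals: int) -> float:
    """Expected number of deleterious allele copies among n diploid individuals."""
    # Each diploid individual carries two copies of the gene.
    return 2 * n_individuals * equilibrium_frequency(mu, hs)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    mu, hs = 1e-6, 0.1
    for n in (10_000, 100_000, 1_000_000):
        print(n, expected_allele_copies(mu, hs, n))
```

In this approximation the expected count scales with the number of people sequenced and with μ/(hs), which is the intuition behind the prediction that sample size, not ancestry grouping, drives discovery of strongly deleterious alleles.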