23andMe, Inc. Sunnyvale, CA 94086, USA.
Am J Hum Genet. 2021 Nov 4;108(11):2052-2070. doi: 10.1016/j.ajhg.2021.09.013.
Pedigree inference from genotype data is a challenging problem, particularly when pedigrees are sparsely sampled and individuals may be only distantly related to their closest genotyped relatives. We present a method that infers small pedigrees of close relatives and then assembles them into larger pedigrees. To assemble large pedigrees, we introduce several formulas and tools, including a likelihood for the degree separating two small pedigrees, a generalization of the fast DRUID point estimate of the degree separating two pedigrees, a method for detecting individuals who share background identity-by-descent (IBD) that does not reflect recent common ancestry, and a method for identifying the ancestral branches through which distant relatives are connected. Our method also uses several strategies to improve the accuracy and efficiency of pedigree inference. In particular, we incorporate age information directly into the likelihood rather than using ages only for consistency checks, and we employ a heuristic branch-and-bound-like approach to explore the space of possible pedigrees more efficiently. Together, these approaches make it possible to construct large pedigrees that are challenging or intractable for current inference methods.
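As rough intuition for what a "degree" of separation means, the expected coefficient of relatedness between two relatives of degree d is about 2^-d (0.5 for parent and child, 0.25 for half siblings, 0.125 for first cousins). The sketch below inverts that textbook relationship to get a naive point estimate of degree from a shared-IBD fraction. It is only an illustration under that assumption: the function name `naive_degree` and its input are hypothetical, and this is not the likelihood or the generalized DRUID estimator described in the abstract.

```python
import math


def naive_degree(ibd_fraction: float) -> int:
    """Naive degree estimate from an observed relatedness (shared-IBD) fraction.

    Assumes the textbook expectation r = 2**(-d) for relatives of degree d,
    so d is approximately -log2(r). Hypothetical helper for illustration only.
    """
    if not 0.0 < ibd_fraction <= 1.0:
        raise ValueError("ibd_fraction must be in (0, 1]")
    # Round to the nearest whole degree; clamp at 0 (self or identical twin).
    return max(0, round(-math.log2(ibd_fraction)))


if __name__ == "__main__":
    for r in (0.5, 0.25, 0.125):
        print(r, naive_degree(r))
```

A single-pair estimate like this becomes unreliable for distant relatives, where the realized sharing varies widely around its expectation and background IBD can inflate it, which is consistent with the abstract's motivation for pedigree-level likelihoods and background-IBD detection.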