Qiao Ying, Jewett Ethan M, McManus Kimberly F, Freyman William A, Curran Joanne E, Williams-Blangero Sarah, Blangero John, Williams Amy L
Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA.
23andMe, Inc., Sunnyvale, CA 94086, USA.
bioRxiv. 2024 May 14:2024.05.10.593578. doi: 10.1101/2024.05.10.593578.
Reconstructing the DNA of ancestors from their descendants has the potential to empower phenotypic analyses (including association and genetic nurture studies), improve pedigree reconstruction, and shed light on the ancestral population and phenotypes of ancestors. We developed HAPI-RECAP, a method that reconstructs the DNA of parents from full siblings and their relatives. This tool leverages the output of HAPI2, a new phasing approach that applies to siblings (and optionally one or both parents) and reliably infers parent haplotypes but does not link the ungenotyped parents' DNA across chromosomes or between segments flanking ambiguities. By using identity-by-descent (IBD) sharing between the reconstructed parents and the relatives, HAPI-RECAP resolves the source parent of these segments. Moreover, the method exploits the crossovers the children inherited, together with sex-specific genetic maps, to infer the reconstructed parents' sexes. We validated these methods on research participants from both 23andMe, Inc. and the San Antonio Mexican American Family Studies. Given data for one parent, HAPI2 reconstructs large fractions of the missing parent's DNA: between 77.6% and 99.97% across all families, and 90.3% on average in three- and four-child families. When reconstructing both parents, HAPI-RECAP inferred between 33.2% and 96.6% of the parents' genotypes, averaging 70.6% in four-child families. Reconstructed genotypes have average error rates < 10, or comparable to those from direct genotyping. HAPI-RECAP inferred the parent sexes 100% correctly given IBD-linked segments and can also reconstruct parents without any IBD. As datasets grow in size, more families will be implicitly collected; HAPI-RECAP holds promise to enable high-quality parent genotype reconstruction.
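The idea of inferring parent sexes from crossovers and sex-specific genetic maps can be illustrated with a minimal sketch. It is not HAPI-RECAP's actual model, which the abstract does not specify. The sketch assumes crossover counts per map interval are Poisson distributed with mean equal to the number of meioses times the interval's sex-specific length in Morgans, and it compares the two possible sex assignments by log-likelihood. All function names, interval lengths, and counts are hypothetical toy values.

```python
import math


def poisson_loglik(counts, lengths_morgans, n_meioses):
    """Log-likelihood of per-interval crossover counts under a Poisson model.

    Assumption (not from the source): the expected count in an interval is
    n_meioses * interval length in Morgans. Lengths must be positive.
    """
    total = 0.0
    for k, length in zip(counts, lengths_morgans):
        lam = n_meioses * length
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total


def infer_parent_sexes(counts_a, counts_b, male_map, female_map, n_children):
    """Compare (A male, B female) against (A female, B male).

    Returns the more likely assignment for (A, B) and the log-likelihood
    ratio; a positive ratio favors A being the father.
    """
    ll_a_male = (poisson_loglik(counts_a, male_map, n_children)
                 + poisson_loglik(counts_b, female_map, n_children))
    ll_a_female = (poisson_loglik(counts_a, female_map, n_children)
                   + poisson_loglik(counts_b, male_map, n_children))
    llr = ll_a_male - ll_a_female
    sexes = ("male", "female") if llr > 0 else ("female", "male")
    return sexes, llr


# Hypothetical toy maps for three intervals (Morgans): the male map is
# relatively longer near the chromosome ends, the female map in the middle.
male_map = [0.5, 0.2, 0.5]
female_map = [0.3, 1.0, 0.3]

# Hypothetical crossover counts seen in four children, per reconstructed parent.
counts_a = [2, 1, 2]
counts_b = [1, 4, 1]

sexes, llr = infer_parent_sexes(counts_a, counts_b, male_map, female_map, 4)
print(sexes, round(llr, 2))
```

With these toy inputs, parent A's crossovers fall mostly in the intervals where the male map is longer, so the comparison favors A as the father. A real analysis would use a published sex-specific genetic map and would need to account for uncertainty in crossover locations.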