A comparison of phasing algorithms for trios and unrelated individuals.

Authors

Marchini Jonathan, Cutler David, Patterson Nick, Stephens Matthew, Eskin Eleazar, Halperin Eran, Lin Shin, Qin Zhaohui S, Munro Heather M, Abecasis Goncalo R, Donnelly Peter

Affiliation

Department of Statistics, University of Oxford, Oxford OX1 3TG, United Kingdom.

Publication information

Am J Hum Genet. 2006 Mar;78(3):437-50. doi: 10.1086/500808. Epub 2006 Jan 26.

Abstract

Knowledge of haplotype phase is valuable for many analysis methods in the study of disease, population, and evolutionary genetics. Considerable research effort has been devoted to the development of statistical and computational methods that infer haplotype phase from genotype data. Although a substantial number of such methods have been developed, they have focused principally on inference from unrelated individuals, and comparisons between methods have been rather limited. Here, we describe the extension of five leading algorithms for phase inference for handling father-mother-child trios. We performed a comprehensive assessment of the methods applied to both trios and unrelated individuals, with a focus on genomic-scale problems, using both simulated data and data from the HapMap project. The most accurate algorithm was PHASE (v2.1). For this method, the percentages of genotypes whose phase was incorrectly inferred were 0.12%, 0.05%, and 0.16% for trios from simulated data, HapMap Centre d'Etude du Polymorphisme Humain (CEPH) trios, and HapMap Yoruban trios, respectively, and 5.2% and 5.9% for unrelated individuals in simulated data and the HapMap CEPH data, respectively. The other methods considered in this work had comparable but slightly worse error rates. The error rates for trios are similar to the levels of genotyping error and missing data expected. We thus conclude that all the methods considered will provide highly accurate estimates of haplotypes when applied to trio data sets. Running times differ substantially between methods. Although it is one of the slowest methods, PHASE (v2.1) was used to infer haplotypes for the 1 million-SNP HapMap data set. Finally, we evaluated methods of estimating the value of r² between a pair of SNPs and concluded that all methods estimated r² well when the estimated value was ≥0.8.
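For readers unfamiliar with the r² statistic mentioned at the end of the abstract, the following is a minimal sketch of how it is commonly computed for a pair of biallelic SNPs once haplotypes have been phased. It uses the standard textbook definition, r² = D² / (pA(1-pA)pB(1-pB)) with D = pAB - pA·pB, and is not taken from the paper or from any of the phasing programs it compares. The function name `r_squared` and the 0/1 allele coding are assumptions made for illustration.

```python
from typing import Sequence


def r_squared(haplotypes: Sequence[Sequence[int]], i: int, j: int) -> float:
    """Return r^2 between SNP columns i and j of phased haplotypes.

    Assumption (illustrative only): each haplotype is a sequence of
    alleles coded 0 or 1, with no missing data.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Allele frequencies of the '1' allele at each SNP.
    p_a = sum(h[i] for h in haplotypes) / n
    p_b = sum(h[j] for h in haplotypes) / n

    # Frequency of haplotypes carrying the '1' allele at both SNPs.
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # r^2 is undefined when either SNP is monomorphic in the sample.
        return float("nan")

    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


if __name__ == "__main__":
    # Two SNPs whose alleles always travel together on the same haplotype.
    linked = [[0, 0], [0, 0], [1, 1], [1, 1]]
    print(r_squared(linked, 0, 1))
```

Because r² depends on the haplotype frequency pAB, phasing errors feed directly into the estimate, which is why the accuracy of r² is a natural way to compare phasing methods.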

Similar articles

2SNP: scalable phasing method for trios and unrelated individuals.
IEEE/ACM Trans Comput Biol Bioinform. 2008 Apr-Jun;5(2):313-8. doi: 10.1109/TCBB.2007.1068.

Using DNA pools for genotyping trios.
Nucleic Acids Res. 2006;34(19):e129. doi: 10.1093/nar/gkl700. Epub 2006 Oct 4.

Cited by

Imputation of ancient human genomes.
Nat Commun. 2023 Jun 20;14(1):3660. doi: 10.1038/s41467-023-39202-0.

North Asian population relationships in a global context.
Sci Rep. 2022 May 4;12(1):7214. doi: 10.1038/s41598-022-10706-x.

References

A haplotype map of the human genome.
Nature. 2005 Oct 27;437(7063):1299-320. doi: 10.1038/nature04226.
