A combined long-range phasing and long haplotype imputation method to impute phase for SNP genotypes.

Affiliation

University of New England, Armidale, Australia.

Publication information

Genet Sel Evol. 2011 Mar 10;43(1):12. doi: 10.1186/1297-9686-43-12.

DOI:10.1186/1297-9686-43-12

PMID:21388557

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3068938/

Abstract

BACKGROUND

Knowing the phase of marker genotype data can be useful in genome-wide association studies, because it makes it possible to use analysis frameworks that account for identity by descent or parent of origin of alleles and it can lead to a large increase in data quantities via genotype or sequence imputation. Long-range phasing and haplotype library imputation constitute a fast and accurate method to impute phase for SNP data.
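To make the phasing problem concrete, the following sketch (an illustration added here, not part of the published method) enumerates the haplotype pairs that are compatible with one unphased genotype. The function name `compatible_haplotype_pairs` and the 0/1/2 allele-dosage coding are assumptions chosen for the example. A genotype with h heterozygous SNPs is compatible with 2^(h-1) distinct haplotype pairs, which is why phase has to be inferred rather than read directly from the genotype.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Return the set of unordered haplotype pairs consistent with a genotype.

    genotype: list of allele dosages per SNP (0 or 2 = homozygous,
    1 = heterozygous). This coding is an assumption of the example.
    """
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites fix the allele on both haplotypes (0 -> 0, 2 -> 1).
        hap_a = [g // 2 for g in genotype]
        hap_b = list(hap_a)
        # Each heterozygous site can be oriented in two ways.
        for site, allele in zip(het_sites, bits):
            hap_a[site] = allele
            hap_b[site] = 1 - allele
        # frozenset makes (a, b) and (b, a) count as the same pair.
        pairs.add(frozenset((tuple(hap_a), tuple(hap_b))))
    return pairs


# Two heterozygous SNPs leave two possible phase configurations.
print(len(compatible_haplotype_pairs([1, 1, 0, 2])))  # -> 2
```

With only a handful of heterozygous SNPs the number of configurations is already large, so practical methods rely on information shared between individuals rather than enumeration.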

METHODS

A long-range phasing and haplotype library imputation algorithm was developed. It combines information from surrogate parents and long haplotypes to resolve phase in a manner that is not dependent on the family structure of a dataset or on the presence of pedigree information.
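The abstract does not state how surrogate parents are identified, so the sketch below rests on an assumption: it uses the criterion commonly associated with long-range phasing, in which two individuals are treated as sharing a haplotype over a window if they have no opposing homozygous genotypes (0 versus 2) in that window. The function names, the 0/1/2 dosage coding, and the toy window length are all hypothetical; this is not the authors' implementation.

```python
def is_surrogate(proband, candidate):
    """True if the two genotype windows contain no opposing homozygotes.

    Assumed criterion: a 0 in one individual against a 2 in the other rules
    out a shared haplotype across the window.
    """
    return all({p, c} != {0, 2} for p, c in zip(proband, candidate))


def shared_haplotype(proband, surrogate):
    """Infer the alleles on the haplotype shared with a surrogate parent.

    Returns one allele (0 or 1) per SNP, or None where phase stays unresolved.
    """
    haplotype = []
    for p, s in zip(proband, surrogate):
        if p != 1:
            # Proband is homozygous: the shared allele is known directly.
            haplotype.append(p // 2)
        elif s != 1:
            # Proband is heterozygous but the surrogate is homozygous,
            # so the shared haplotype must carry the surrogate's allele.
            haplotype.append(s // 2)
        else:
            # Both heterozygous: this surrogate cannot resolve the site.
            haplotype.append(None)
    return haplotype


proband = [1, 1, 0, 2, 1]
candidate = [0, 2, 1, 1, 1]
if is_surrogate(proband, candidate):
    print(shared_haplotype(proband, candidate))  # -> [0, 1, 0, 1, None]
```

Sites left as `None` are the ones where additional surrogates, or matching against a library of already phased long haplotypes, would be needed to complete the phase.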

RESULTS

The algorithm performed well in both simulated and real livestock and human datasets in terms of both phasing accuracy and computational efficiency. The percentage of alleles that could be phased in both simulated and real datasets of varying size generally exceeded 98%, while the percentage of alleles incorrectly phased in simulated data was generally less than 0.5%. The accuracy of phasing was affected by dataset size, with lower accuracy for dataset sizes below 1000, but was not affected by effective population size, family data structure, presence or absence of pedigree information, or SNP density. The method was computationally fast. In comparison to a commonly used statistical method (fastPHASE), the current method made about 8% fewer phasing mistakes and ran about 26 times faster for a small dataset. For larger datasets, the differences in computational time are expected to be even greater. A computer program implementing these methods has been made available.
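The two headline metrics, the percentage of alleles phased and the percentage incorrectly phased, can be computed from simulated data where the true haplotypes are known. The sketch below is one plausible way to do so. It assumes that both percentages use all alleles as the denominator and that inferred haplotypes are listed in the same order as the true ones; the abstract does not specify either point, and the function name `phasing_summary` is hypothetical.

```python
def phasing_summary(true_haplotypes, inferred_haplotypes):
    """Return (percent of alleles phased, percent incorrectly phased).

    Each argument is a list of haplotypes; a haplotype is a list of alleles
    (0 or 1). Unphased alleles in the inferred data are marked None.
    Assumption: both percentages are taken over all alleles.
    """
    total = phased = wrong = 0
    for true_hap, inferred_hap in zip(true_haplotypes, inferred_haplotypes):
        for true_allele, inferred_allele in zip(true_hap, inferred_hap):
            total += 1
            if inferred_allele is None:
                continue  # left unphased, counts against yield only
            phased += 1
            if inferred_allele != true_allele:
                wrong += 1
    return 100.0 * phased / total, 100.0 * wrong / total


true_haps = [[0, 1, 1, 0], [1, 0, 1, 0]]
inferred = [[0, 1, None, 0], [1, 1, None, 0]]
print(phasing_summary(true_haps, inferred))  # -> (75.0, 12.5)
```

Reporting yield and error separately matters because a method can lower its error rate simply by leaving difficult alleles unphased.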

CONCLUSIONS

The algorithm and software developed in this study make feasible the routine phasing of high-density SNP chips in large datasets.
