Estimating haplotype frequencies in pooled DNA samples when there is genotyping error.

Author information

Quade Shannon R E, Elston Robert C, Goddard Katrina A B

Affiliation

Department of Epidemiology and Biostatistics, Case Western Reserve University, 2103 Cornell Rd, Cleveland, Ohio 44106-7281, USA.

Publication information

BMC Genet. 2005 May 19;6:25. doi: 10.1186/1471-2156-6-25.

DOI:10.1186/1471-2156-6-25

PMID:15943883

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1156884/

Abstract

BACKGROUND

Maximum likelihood estimates of haplotype frequencies can be obtained from pooled DNA using the expectation maximization (EM) algorithm. Through simulation, we investigate the effect of genotyping error on the accuracy of haplotype frequency estimates obtained using this algorithm. We explore model parameters including allele frequency, inter-marker linkage disequilibrium (LD), genotyping error rate, and pool size.
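
As an illustration of the kind of EM estimation described above, the following minimal Python sketch handles two biallelic markers. It is not the authors' implementation: the function names (`compatible_configs`, `multinomial_prob`, `em_pooled`) are hypothetical, and the sketch assumes that each pool reports exact allele counts at each marker and that no genotyping error is modelled in the likelihood.

```python
from math import factorial


def compatible_configs(a, b, n_chrom):
    """Yield haplotype counts (n00, n01, n10, n11) consistent with a pool.

    a, b are the counts of allele "1" at marker 1 and marker 2 among the
    n_chrom chromosomes in the pool (assumed observation format).
    """
    for t in range(max(0, a + b - n_chrom), min(a, b) + 1):
        # t is the number of 11 haplotypes; the other counts follow from a, b.
        yield (n_chrom - a - b + t, b - t, a - t, t)


def multinomial_prob(counts, freqs):
    """Multinomial probability of haplotype counts given frequencies."""
    coef = factorial(sum(counts))
    prob = 1.0
    for c, f in zip(counts, freqs):
        coef //= factorial(c)  # exact at every step
        prob *= f ** c
    return coef * prob


def em_pooled(pools, n_chrom, n_iter=200, tol=1e-10):
    """Estimate frequencies of haplotypes 00, 01, 10, 11 by EM."""
    freqs = [0.25] * 4  # uniform starting values (an arbitrary choice)
    for _ in range(n_iter):
        expected = [0.0] * 4
        for a, b in pools:
            # E-step: posterior weight of each compatible configuration.
            configs = list(compatible_configs(a, b, n_chrom))
            weights = [multinomial_prob(c, freqs) for c in configs]
            total = sum(weights)
            for c, w in zip(configs, weights):
                for h in range(4):
                    expected[h] += c[h] * w / total
        # M-step: expected haplotype counts divided by total chromosomes.
        new = [e / (n_chrom * len(pools)) for e in expected]
        converged = max(abs(x - y) for x, y in zip(new, freqs)) < tol
        freqs = new
        if converged:
            break
    return freqs
```

For pools of 2 individuals (4 chromosomes), `em_pooled([(0, 0), (4, 4)], n_chrom=4)` places all of the estimated frequency on haplotypes 00 and 11, because both pools are unambiguous. Ambiguity, and therefore the need for EM, arises when a pool carries both alleles at both markers.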

RESULTS

Pool sizes of 2, 5, and 10 individuals achieved comparable levels of accuracy in the estimation procedure. Common marker allele frequencies and no inter-marker LD result in less accurate estimates. This pattern is observed regardless of the amount of genotyping error simulated.
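
To make the simulation parameters concrete, the sketch below generates pooled allele counts for two biallelic markers under a chosen pool size and error rate. The error model (each allele call flipped independently with probability `error_rate`) and the name `simulate_pools` are assumptions for illustration only; the abstract does not specify the error model the authors used.

```python
import random

# Haplotypes over two biallelic markers, in the order 00, 01, 10, 11.
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def simulate_pools(freqs, pool_size, n_pools, error_rate, seed=0):
    """Return a list of (a, b) allele-"1" counts, one pair per pool.

    freqs      -- haplotype frequencies for 00, 01, 10, 11
    pool_size  -- individuals per pool (2 chromosomes each)
    error_rate -- assumed per-allele probability of a flipped call
    """
    rng = random.Random(seed)
    n_chrom = 2 * pool_size
    pools = []
    for _ in range(n_pools):
        a = b = 0
        for hap in rng.choices(HAPLOTYPES, weights=freqs, k=n_chrom):
            observed = []
            for allele in hap:
                # Assumed error model: flip the call with probability error_rate.
                if rng.random() < error_rate:
                    allele = 1 - allele
                observed.append(allele)
            a += observed[0]
            b += observed[1]
        pools.append((a, b))
    return pools
```

Inter-marker LD enters through `freqs`: under no LD, each haplotype frequency equals the product of its two allele frequencies. Comparing estimates from data simulated with and without error, across pool sizes of 2, 5, and 10, mirrors the kind of comparison the abstract reports.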

CONCLUSION

Genotyping error slightly decreases the accuracy of haplotype frequency estimates. However, the EM algorithm performs well even in the presence of genotyping error. Overall, pools of 2, 5, and 10 individuals yield similar accuracy of the haplotype frequency estimates, while reducing costs due to genotyping.
