Distinct error rates for reference and nonreference genotypes estimated by pedigree analysis.

Affiliations

Department of Biology, Indiana University, Bloomington, IN 47405, USA.

Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA.

Publication information

Genetics. 2021 Mar 3;217(1):1-10. doi: 10.1093/genetics/iyaa014.

Abstract

Errors in genotype calling can have perverse effects on genetic analyses, confounding association studies and obscuring rare variants. Analyses now routinely incorporate error rates to control for spurious findings. However, reliable estimates of the error rate can be difficult to obtain because error rates vary between studies. Most studies also report only a single estimate of the error rate even though genotypes can be miscalled in more than one way. Here, we report a method for estimating the rates at which different types of genotyping errors occur at biallelic loci using pedigree information. Our method identifies potential genotyping errors by exploiting instances where the haplotypic phase has not been faithfully transmitted. The expected frequency of inconsistent phase depends on the combination of genotypes in a pedigree and the probability of miscalling each genotype. We develop a model that uses the differences in these frequencies to estimate rates for different types of genotype error. Simulations show that our method accurately estimates these error rates in a variety of scenarios. We apply this method to a dataset from the whole-genome sequencing of owl monkeys (Aotus nancymaae) in three-generation pedigrees. We find significant differences between estimates for different types of genotyping error, with the most common being homozygous reference sites miscalled as heterozygous and vice versa. The approach we describe is applicable to any set of genotypes where haplotypic phase can reliably be called and should prove useful in helping to control for false discoveries.
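The dependence of the inconsistency frequency on both the genotype combination and the per-genotype miscall probabilities can be illustrated with a small calculation. The sketch below is not the authors' model: it swaps in plain Mendelian inconsistency in a parent-offspring trio as a simpler stand-in for inconsistent haplotypic phase in a three-generation pedigree. The function names, the genotype coding, and the error rates in `err` are all hypothetical choices made for illustration.

```python
from itertools import product

# Genotypes coded as the number of nonreference alleles (an assumed coding):
# 0 = homozygous reference, 1 = heterozygous, 2 = homozygous nonreference.
GENOTYPES = (0, 1, 2)


def child_distribution(father, mother):
    """Mendelian probabilities of the child's true genotype."""
    dist = {g: 0.0 for g in GENOTYPES}
    # Each parent transmits a nonreference allele with probability genotype / 2.
    for a, pa in ((0, 1 - father / 2), (1, father / 2)):
        for b, pb in ((0, 1 - mother / 2), (1, mother / 2)):
            dist[a + b] += pa * pb
    return dist


def is_consistent(father, mother, child):
    """True if the called child genotype is possible given the called parents."""
    return child_distribution(father, mother)[child] > 0


def call_probs(true_gt, err):
    """Probability of each called genotype.

    err[(t, c)] is the (hypothetical) rate at which true genotype t is
    miscalled as c; missing entries are treated as zero.
    """
    probs = {c: err.get((true_gt, c), 0.0) for c in GENOTYPES if c != true_gt}
    probs[true_gt] = 1.0 - sum(probs.values())
    return probs


def expected_inconsistency(father, mother, err):
    """Expected fraction of trios whose called genotypes violate Mendelian
    inheritance, given the parents' true genotypes and the error rates."""
    total = 0.0
    for child, p_child in child_distribution(father, mother).items():
        if p_child == 0.0:
            continue
        calls = product(
            call_probs(father, err).items(),
            call_probs(mother, err).items(),
            call_probs(child, err).items(),
        )
        for (cf, pf), (cm, pm), (cc, pc) in calls:
            if not is_consistent(cf, cm, cc):
                total += p_child * pf * pm * pc
    return total


# Illustrative error rates only; these are not estimates from the study.
err = {(0, 1): 0.01, (1, 0): 0.02}
for parents in ((0, 0), (0, 2), (1, 1)):
    print(parents, expected_inconsistency(*parents, err))
```

Under these assumed rates, trios with two homozygous reference parents are sensitive only to the reference-to-heterozygous rate, whereas trios with one homozygous reference and one homozygous nonreference parent are sensitive to the heterozygous-to-reference rate. Differences of this kind between genotype combinations are what make it possible, in principle, to separate the error types.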
