A maximum-likelihood method to correct for allelic dropout in microsatellite data with no replicate genotypes.

Affiliation

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.

Publication information

Genetics. 2012 Oct;192(2):651-69. doi: 10.1534/genetics.112.139519. Epub 2012 Jul 30.

DOI: 10.1534/genetics.112.139519
PMID: 22851645
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3660999/
Abstract

Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy-Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. 
Because the data sets imputed under our model can be investigated in additional subsequent analyses, our method will be useful for preparing data for applications in diverse contexts in population genetics and molecular ecology.
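
To make the two central ideas of the abstract concrete, the following is a minimal sketch in Python, not the authors' method. It assumes a toy model that is much simpler than the one described above: a single locus, one dropout rate `gamma` shared by all samples, Hardy-Weinberg proportions with no inbreeding, and missing genotypes that arise only when both copies drop out. The function names `simulate` and `em_dropout`, the allele frequencies, and the sample size are all hypothetical choices for illustration. The sketch first shows that dropout lowers the observed heterozygosity, then uses a small expectation-maximization loop to estimate `gamma` and the allele frequencies jointly from nonreplicated genotypes.

```python
import random
from collections import Counter

def simulate(n, freqs, gamma, rng):
    """Draw n diploid genotypes under Hardy-Weinberg proportions, then drop
    each allele copy independently with probability gamma (toy assumption).
    Each observed genotype is a sorted tuple of the distinct surviving
    alleles: () is missing, (a,) looks homozygous, (a, b) is heterozygous."""
    alleles = list(range(len(freqs)))
    true_het, observed = 0, []
    for _ in range(n):
        a, b = rng.choices(alleles, weights=freqs, k=2)
        true_het += a != b
        kept = [x for x in (a, b) if rng.random() >= gamma]
        observed.append(tuple(sorted(set(kept))))
    return true_het, observed

def em_dropout(observed, k, iters=200):
    """EM for one shared dropout rate and k allele frequencies (toy model)."""
    n = len(observed)
    counts = Counter(observed)
    p = [1.0 / k] * k          # starting allele frequencies
    gamma = 0.1                # starting dropout rate
    for _ in range(iters):
        allele = [0.0] * k     # expected counts of true allele copies
        drops = 0.0            # expected number of dropped copies
        for geno, c in counts.items():
            if len(geno) == 2:
                # Both copies were seen, so nothing is hidden.
                allele[geno[0]] += c
                allele[geno[1]] += c
            elif len(geno) == 1:
                # One allele seen: either a true homozygote with both copies
                # amplified, or one hidden copy (of any allele) dropped out.
                a = geno[0]
                w_drop = 2 * gamma / (p[a] * (1 - gamma) + 2 * gamma)
                allele[a] += c * (1 + (1 - w_drop))
                for y in range(k):
                    allele[y] += c * w_drop * p[y]
                drops += c * w_drop
            else:
                # Missing genotype: both copies dropped, alleles follow p.
                for y in range(k):
                    allele[y] += c * 2 * p[y]
                drops += 2 * c
        # M-step: renormalize expected counts over the 2n allele copies.
        p = [x / (2 * n) for x in allele]
        gamma = drops / (2 * n)
    return gamma, p

rng = random.Random(1)
freqs, gamma_true, n = [0.5, 0.3, 0.2], 0.2, 20000   # hypothetical values
true_het, obs = simulate(n, freqs, gamma_true, rng)
typed = [g for g in obs if g]                         # non-missing genotypes
h_true = true_het / n
h_obs = sum(len(g) == 2 for g in typed) / len(typed)
gamma_hat, p_hat = em_dropout(obs, len(freqs))
print(f"true H = {h_true:.3f}, observed H = {h_obs:.3f}")
print(f"estimated gamma = {gamma_hat:.3f}, estimated freqs = {p_hat}")
```

Under these toy assumptions, the observed heterozygosity among typed genotypes is expected to shrink from H to H(1 - gamma)/(1 + gamma), which is the downward bias the abstract describes. The model in the paper additionally separates sample-specific and locus-specific dropout rates, allows for inbreeding, and corrects heterozygosity by multiple imputation; none of that is reproduced here.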

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/a983b377194e/651fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/ab8faa38f42c/651fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/04a90b4890ad/651fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/f9d1c3fd0372/651fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/a371fe50f709/651fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/4f7b5c80b8c9/651fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/3e1503604e49/651fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/200ca28feaab/651fig8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff8f/3660999/2e00d1d4b132/651fig9.jpg

Similar articles

1
A maximum-likelihood method to correct for allelic dropout in microsatellite data with no replicate genotypes.
Genetics. 2012 Oct;192(2):651-69. doi: 10.1534/genetics.112.139519. Epub 2012 Jul 30.
2
Maximum-likelihood estimation of allelic dropout and false allele error rates from microsatellite genotypes in the absence of reference data.
Genetics. 2007 Feb;175(2):827-42. doi: 10.1534/genetics.106.064618. Epub 2006 Dec 18.
3
Maximum likelihood estimation of individual inbreeding coefficients and null allele frequencies.
Genet Res (Camb). 2012 Jun;94(3):151-61. doi: 10.1017/S0016672312000341. Epub 2012 Jul 18.
4
Assessing allelic dropout and genotype reliability using maximum likelihood.
Genetics. 2002 Jan;160(1):357-66. doi: 10.1093/genetics/160.1.357.
5
Allele frequencies of microsatellite loci for genetic characterization of a Sicilian bovine population.
Genet Mol Res. 2015 Jan 30;14(1):691-9. doi: 10.4238/2015.January.30.12.
6
Noninvasive genotyping and Mendelian analysis of microsatellites in African savannah elephants.
J Hered. 2005 Nov-Dec;96(6):679-87. doi: 10.1093/jhered/esi117. Epub 2005 Oct 26.
7
Hardy-Weinberg analysis of a large set of published association studies reveals genotyping error and a deficit of heterozygotes across multiple loci.
Hum Genomics. 2008 Sep;3(1):36-52. doi: 10.1186/1479-7364-3-1-36.
8
Maximum-likelihood and Markov chain Monte Carlo approaches to estimate inbreeding and effective size from allele frequency changes.
Genetics. 2003 Jul;164(3):1189-204. doi: 10.1093/genetics/164.3.1189.
9
Polymorphic microsatellite loci isolated from Cervus unicolor (Cervidae) show inbreeding in a domesticated population of Taiwan Sambar deer.
Genet Mol Res. 2014 May 23;13(2):3967-71. doi: 10.4238/2014.May.23.7.
10
Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization algorithm for unphased diploid genotype data.
Am J Hum Genet. 2000 Oct;67(4):947-59. doi: 10.1086/303069. Epub 2000 Aug 22.

Cited by

1
Allelic dropout in the endoglin () gene caused by common duplication beyond the primer binding site.
Front Genet. 2025 Jun 11;16:1571437. doi: 10.3389/fgene.2025.1571437. eCollection 2025.
2
Machine Learning Methods for Classifying Multiple Sclerosis and Alzheimer's Disease Using Genomic Data.
Int J Mol Sci. 2025 Feb 27;26(5):2085. doi: 10.3390/ijms26052085.
3
STRyper: A macOS application for microsatellite genotyping and chromatogram management.
PLoS One. 2025 Feb 20;20(2):e0318806. doi: 10.1371/journal.pone.0318806. eCollection 2025.
4
Population genetic differentiation of the ubiquitous brooding coral Pocillopora acuta along Phuket Island reefs in the Andaman Sea, Thailand.
BMC Ecol Evol. 2023 Aug 26;23(1):42. doi: 10.1186/s12862-023-02153-7.
5
Single-Cell Next-Generation Sequencing to Monitor Hematopoietic Stem-Cell Transplantation: Current Applications and Future Perspectives.
Cancers (Basel). 2023 Apr 26;15(9):2477. doi: 10.3390/cancers15092477.
6
Appraising the Genetic Makeup of an Allochthonous Southern Pike Population: An Opportunity to Predict the Evolution of Introgressive Hybridization in Isolated Populations?
Animals (Basel). 2023 Jan 22;13(3):380. doi: 10.3390/ani13030380.
7
Reconstructing tumor clonal lineage trees incorporating single-nucleotide variants, copy number alterations and structural variations.
Bioinformatics. 2022 Jun 24;38(Suppl 1):i125-i133. doi: 10.1093/bioinformatics/btac253.
8
Natural Barcodes for Longitudinal Single Cell Tracking of Leukemic and Immune Cell Dynamics.
Front Immunol. 2022 Jan 3;12:788891. doi: 10.3389/fimmu.2021.788891. eCollection 2021.
9
Impact of genotypic errors with equal and unequal family contribution on accuracy of genomic prediction in aquaculture using simulation.
Sci Rep. 2021 Sep 15;11(1):18318. doi: 10.1038/s41598-021-97873-5.
10
Allelic Dropout Is a Common Phenomenon That Reduces the Diagnostic Yield of PCR-Based Sequencing of Targeted Gene Panels.
Front Genet. 2021 Feb 1;12:620337. doi: 10.3389/fgene.2021.620337. eCollection 2021.

References

1
Genetics in geographically structured populations: defining, estimating and interpreting F(ST).
Nat Rev Genet. 2009 Sep;10(9):639-50. doi: 10.1038/nrg2611.
2
Incorporating genotype uncertainty into mark-recapture-type models for estimating abundance using DNA samples.
Biometrics. 2009 Sep;65(3):833-40. doi: 10.1111/j.1541-0420.2008.01165.x. Epub 2009 Jan 23.
3
Genetic variation and population structure in native Americans.
PLoS Genet. 2007 Nov;3(11):e185. doi: 10.1371/journal.pgen.0030185.
4
Using DNA to track the origin of the largest ivory seizure since the 1989 trade ban.
Proc Natl Acad Sci U S A. 2007 Mar 6;104(10):4228-33. doi: 10.1073/pnas.0609714104. Epub 2007 Feb 26.
5
Maximum-likelihood estimation of allelic dropout and false allele error rates from microsatellite genotypes in the absence of reference data.
Genetics. 2007 Feb;175(2):827-42. doi: 10.1534/genetics.106.064618. Epub 2006 Dec 18.
6
Towards unbiased parentage assignment: combining genetic, behavioural and spatial data in a Bayesian framework.
Mol Ecol. 2006 Oct;15(12):3715-30. doi: 10.1111/j.1365-294X.2006.03050.x.
7
Genotyping errors: causes, consequences and solutions.
Nat Rev Genet. 2005 Nov;6(11):847-59. doi: 10.1038/nrg1707.
8
Microsatellite genotyping errors: detection approaches, common sources and consequences for paternal exclusion.
Mol Ecol. 2005 Feb;14(2):599-612. doi: 10.1111/j.1365-294X.2004.02419.x.
9
Quantifying genotyping errors in noninvasive population genetics.
Mol Ecol. 2004 Nov;13(11):3601-8. doi: 10.1111/j.1365-294X.2004.02352.x.
10
How to track and assess genotyping errors in population genetics studies.
Mol Ecol. 2004 Nov;13(11):3261-73. doi: 10.1111/j.1365-294X.2004.02346.x.