Testing for linkage disequilibrium in genotypic data using the Expectation-Maximization algorithm.

Authors

Slatkin M, Excoffier L

Affiliation

Department of Integrative Biology, University of California, Berkeley 94720-3140, USA.

Publication

Heredity (Edinb). 1996 Apr;76(Pt 4):377-83. doi: 10.1038/hdy.1996.55.

DOI: 10.1038/hdy.1996.55
PMID: 8626222
Abstract

We generalize an approach suggested by Hill (Heredity, 33, 229-239, 1974) for testing for significant association among alleles at two loci when only genotype and not haplotype frequencies are available. The principle is to use the Expectation-Maximization (EM) algorithm to resolve double heterozygotes into haplotypes and then apply a likelihood ratio test in order to determine whether the resolutions of haplotypes are significantly nonrandom, which is equivalent to testing whether there is statistically significant linkage disequilibrium between loci. The EM algorithm in this case relies on the assumption that genotype frequencies at each locus are in Hardy-Weinberg proportions. This method can accommodate X-linked loci and samples from haplodiploid species. We use three methods for testing significance of the likelihood ratio: the empirical distribution in a large number of randomized data sets, the χ² approximation for the distribution of likelihood ratios, and the Z² test. The performance of each method is evaluated by applying it to simulated data sets and comparing the tail probability with the tail probability from Fisher's exact test applied to the actual haplotype data. For realistic sample sizes (50-150 individuals) all three methods perform well with two or three alleles per locus, but only the empirical distribution is adequate when there are five to eight alleles per locus, as is typical of hypervariable loci such as microsatellites. The method is applied to a data set of 32 microsatellite loci in a Finnish population and the results confirm the theoretical predictions. We conclude that with highly polymorphic loci, the EM algorithm does lead to a useful test for linkage disequilibrium, but that it is necessary to find the empirical distribution of likelihood ratios in order to perform a test of significance correctly.
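
The abstract gives no formulas, so the following is only a minimal sketch of the idea for the simplest case: two biallelic loci (alleles A/a and B/b) in a diploid sample, assuming Hardy-Weinberg proportions. The function names, the 0/1/2 genotype coding, the choice of starting point, and the shuffling scheme used for the randomized data sets are all illustrative assumptions, not the authors' implementation. The Z² test is not covered.

```python
import math
import random

def genotype_table(pairs):
    """Tabulate (copies of A, copies of B) pairs, each 0/1/2, into a 3x3 table."""
    n = [[0] * 3 for _ in range(3)]
    for a, b in pairs:
        n[a][b] += 1
    return n

def equilibrium_freqs(n):
    """Haplotype frequencies (AB, Ab, aB, ab) under linkage equilibrium."""
    total = 2 * sum(map(sum, n))
    pA = (2 * sum(n[2]) + sum(n[1])) / total
    pB = (2 * sum(r[2] for r in n) + sum(r[1] for r in n)) / total
    return (pA * pB, pA * (1 - pB), (1 - pA) * pB, (1 - pA) * (1 - pB))

def haplotype_counts(n, w):
    """Expected haplotype counts when a fraction w of double heterozygotes
    is resolved as AB/ab and the remaining 1 - w as Ab/aB."""
    d = n[1][1]
    return (2 * n[2][2] + n[2][1] + n[1][2] + w * d,
            2 * n[2][0] + n[2][1] + n[1][0] + (1 - w) * d,
            2 * n[0][2] + n[0][1] + n[1][2] + (1 - w) * d,
            2 * n[0][0] + n[0][1] + n[1][0] + w * d)

def em_haplotype_freqs(n, tol=1e-10, max_iter=1000):
    """EM estimate of haplotype frequencies, started at linkage equilibrium.
    A single start is an assumption; EM can stop at a local optimum."""
    total = 2 * sum(map(sum, n))
    h = equilibrium_freqs(n)
    for _ in range(max_iter):
        AB, Ab, aB, ab = h
        denom = AB * ab + Ab * aB
        w = AB * ab / denom if denom > 0 else 0.5      # E-step
        new = tuple(c / total for c in haplotype_counts(n, w))  # M-step
        if max(abs(x - y) for x, y in zip(new, h)) < tol:
            return new
        h = new
    return h

def log_likelihood(n, h):
    """Log-likelihood of the genotype table under random union of haplotypes."""
    AB, Ab, aB, ab = h
    prob = [[ab * ab, 2 * aB * ab, aB * aB],
            [2 * Ab * ab, 2 * (AB * ab + Ab * aB), 2 * AB * aB],
            [Ab * Ab, 2 * AB * Ab, AB * AB]]
    return sum(n[i][j] * math.log(prob[i][j])
               for i in range(3) for j in range(3) if n[i][j] > 0)

def lr_statistic(n):
    """Twice the log-likelihood gain of the EM fit over linkage equilibrium."""
    gain = log_likelihood(n, em_haplotype_freqs(n)) \
        - log_likelihood(n, equilibrium_freqs(n))
    return max(0.0, 2 * gain)

def chi2_pvalue_1df(x):
    """Upper tail of chi-square with 1 df (the two-allele-per-locus case)."""
    return math.erfc(math.sqrt(x / 2))

def permutation_pvalue(pairs, n_perm=1000, seed=0):
    """Empirical tail probability: shuffle locus-B genotypes among individuals
    (an assumed randomization scheme) and recompute the statistic each time."""
    rng = random.Random(seed)
    observed = lr_statistic(genotype_table(pairs))
    a = [p[0] for p in pairs]
    b = [p[1] for p in pairs]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(b)
        if lr_statistic(genotype_table(zip(a, b))) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Made-up sample of 50 individuals in which A travels with B.
sample = [(2, 2)] * 20 + [(1, 1)] * 20 + [(0, 0)] * 10
table = genotype_table(sample)
print("haplotype frequencies:", em_haplotype_freqs(table))
print("chi-square p:", chi2_pvalue_1df(lr_statistic(table)))
print("empirical p:", permutation_pvalue(sample, n_perm=500))
```

With k and l alleles at the two loci, the χ² approximation would use (k-1)(l-1) degrees of freedom instead of 1. That many-allele setting is where the abstract reports that only the empirical distribution is adequate, so the permutation route is the one to extend when going beyond this two-allele sketch.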


