McPeek M S, Strahs A
Department of Statistics, University of Chicago, Chicago, IL 60637, USA.
Am J Hum Genet. 1999 Sep;65(3):858-75. doi: 10.1086/302537.
Linkage disequilibrium (LD) is of great interest for gene mapping and the study of population history. We propose a multilocus model for LD, based on the decay of haplotype sharing (DHS). The DHS model is most appropriate when the LD in which one is interested is due to the introduction of a variant on an ancestral haplotype, with recombinations in succeeding generations resulting in preservation of only a small region of the ancestral haplotype around the variant. This is generally the scenario of interest for gene mapping by LD. The DHS parameter is a measure of LD that can be interpreted as the expected genetic distance to which the ancestral haplotype is preserved, or, equivalently, 1/(time in generations to the ancestral haplotype). The method allows for multiple origins of alleles and for mutations, and it takes into account missing observations and ambiguities in haplotype determination, via a hidden Markov model. Whereas most commonly used measures of LD apply to pairs of loci, the DHS measure is designed for application to the densely mapped haplotype data that are increasingly available. The DHS method explicitly models the dependence among multiple tightly linked loci on a chromosome. When the assumptions about population structure are sufficiently tractable, the estimate of LD is obtained by maximum likelihood. For more-complicated models of population history, we find means and covariances based on the model and solve a quasi-score estimating equation. Simulations show that this approach works extremely well both for estimation of LD and for fine mapping. We apply the DHS method to published data sets for cystic fibrosis and progressive myoclonus epilepsy.
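The interpretation of the DHS parameter as 1/(time in generations to the ancestral haplotype) can be illustrated with a small simulation. This is a minimal sketch and not the authors' DHS method: it assumes a single origin, no mutation, and crossovers that fall independently along the chromosome at a rate of one per Morgan per meiosis (no interference), and it measures preservation in one direction from the variant only. The function names and parameters are hypothetical, chosen for illustration.

```python
import random


def preserved_distance(generations, rng):
    """Distance (in Morgans) from the variant to the nearest crossover
    accumulated over `generations` meioses, on one side of the variant.

    Assumption: in each meiosis the distance to the nearest crossover
    is exponential with rate 1 per Morgan (no interference).
    """
    return min(rng.expovariate(1.0) for _ in range(generations))


def mean_preserved_distance(generations, n_haplotypes=5000, seed=0):
    """Average preserved distance over independent simulated haplotypes."""
    rng = random.Random(seed)
    total = sum(preserved_distance(generations, rng) for _ in range(n_haplotypes))
    return total / n_haplotypes


if __name__ == "__main__":
    for t in (10, 20, 100):
        simulated = mean_preserved_distance(t)
        print(f"t = {t:3d} generations: simulated mean = {simulated:.4f} M, "
              f"1/t = {1 / t:.4f} M")
```

Under these assumptions the simulated mean should track 1/t closely, because the minimum of t independent rate-1 exponential distances is itself exponential with rate t. The model described in the abstract goes further, allowing for multiple origins, mutation, and missing or ambiguous haplotype data through a hidden Markov model, none of which is represented here.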