VanLiere Jenna M, Rosenberg Noah A
Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI 48109, USA.
Theor Popul Biol. 2008 Aug;74(1):130-7. doi: 10.1016/j.tpb.2008.05.006. Epub 2008 Jun 1.
Statistics for linkage disequilibrium (LD), the non-random association of alleles at two loci, depend on the frequencies of the alleles at the loci under consideration. Here, we examine the r^2 measure of LD and its mathematical relationship to allele frequencies, quantifying the constraints on its maximum value. Assuming independent uniform distributions for the allele frequencies of two biallelic loci, we find that the mean maximum value of r^2 is approximately 0.43051, and that r^2 can exceed a threshold of 4/5 in only approximately 14.232% of the allele frequency space. If one locus is assumed to have known allele frequencies (the situation in an association study in which LD between a known marker locus and an unknown trait locus is of interest), we find that the mean maximum value of r^2 is greatest when the known locus has a minor allele frequency of approximately 0.30131. We find that in 1/4 of the space of allowed values of minor allele frequencies and haplotype frequencies at a pair of loci, the unconstrained maximum r^2, allowing for the possibility of recombination between the loci, exceeds the constrained maximum assuming that no recombination has occurred. Finally, we use r^2_max to examine the connection between r^2 and the D' measure of linkage disequilibrium, finding that r^2/r^2_max = D'^2 for approximately 72.683% of the space of allowed values of (p_a, p_b, p_ab). Our results concerning the properties of r^2 have the potential to inform the interpretation of unusual LD behavior and to assist in the design of LD-based association-mapping studies.
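The first two numerical results can be checked with a short sketch. It assumes the standard definitions D = p_ab - p_a p_b and r^2 = D^2 / (p_a (1 - p_a) p_b (1 - p_b)), with D bounded so that all four haplotype frequencies stay non-negative. The function names `r2_max` and `grid_summary` are hypothetical, and the midpoint-grid average is only a numerical stand-in, not the analytical approach taken in the paper.

```python
def r2_max(pa, pb):
    """Largest r^2 attainable at allele frequencies pa and pb, each strictly in (0, 1)."""
    # Bounds on D that keep every haplotype frequency non-negative.
    d_hi = min(pa * (1 - pb), (1 - pa) * pb)
    d_lo = max(-pa * pb, -(1 - pa) * (1 - pb))
    denom = pa * (1 - pa) * pb * (1 - pb)
    # r^2 is quadratic in D, so its maximum sits at one end of the allowed range.
    return max(d_hi ** 2, d_lo ** 2) / denom


def grid_summary(n=400, threshold=0.8):
    """Average r2_max over an n-by-n midpoint grid on the unit square.

    Returns (mean of r2_max, fraction of grid cells where r2_max > threshold).
    """
    total = 0.0
    above = 0
    for i in range(n):
        pa = (i + 0.5) / n
        for j in range(n):
            pb = (j + 0.5) / n
            value = r2_max(pa, pb)
            total += value
            if value > threshold:
                above += 1
    cells = n * n
    return total / cells, above / cells


mean_r2_max, frac_above = grid_summary()
print(f"mean of r2_max over the grid: {mean_r2_max:.4f}")
print(f"fraction of grid with r2_max > 4/5: {frac_above:.4f}")
```

With a grid of this resolution, the two printed values should land close to the 0.43051 and 14.232% reported in the abstract. A finer grid tightens the agreement at the cost of run time.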