INRA, UR875 Unité de Biométrie et Intelligence Artificielle, Chemin de Borde Rouge, Castanet-Tolosan, France.
Heredity (Edinb). 2012 Mar;108(3):285-91. doi: 10.1038/hdy.2011.73. Epub 2011 Aug 31.
Among the several linkage disequilibrium measures known to capture different features of the non-independence between alleles at different loci, the most commonly used for diallelic loci is the r^2 measure. In the present study, we tackled the problem of the bias of the r^2 estimate, which results from the sample structure and/or the relatedness between genotyped individuals. We derived two novel linkage disequilibrium measures for diallelic loci that are both extensions of the usual r^2 measure. The first one, r_S^2, uses the population structure matrix, which holds information about the origin of each individual and the admixture proportions of each individual genome. The second one, r_V^2, incorporates the kinship matrix into the calculation. These two corrections can be applied together in order to correct for both biases, and they are defined on either phased or unphased genotypes. We proved that these novel measures are linked to the power of association tests under the mixed linear model including structure and kinship corrections. We validated them on simulated data and applied them to real data sets collected on Vitis vinifera plants. Our results clearly showed the usefulness of the two corrected r^2 measures, which, unlike the usual r^2 measure, actually captured 'true' linkage disequilibrium.
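To make the idea of a kinship-aware r^2 concrete, the following is a minimal Python sketch, not the authors' implementation. The function `r2` computes the usual measure as the squared Pearson correlation between two allele-count vectors. The function `r2_weighted` is an assumption made for illustration only: it replaces the ordinary inner product with one weighted by the inverse of a kinship-like matrix `V`, which is one plausible reading of "includes the kinship matrix into the calculation"; the exact definition of r_V^2 in the paper may differ. All names (`r2`, `r2_weighted`, `V`) are hypothetical.

```python
import numpy as np


def r2(x, y):
    """Usual r^2: squared Pearson correlation between two genotype vectors."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) ** 2 / ((xc @ xc) * (yc @ yc)))


def r2_weighted(x, y, V):
    """Illustrative kinship-weighted r^2 (ASSUMPTION, not the paper's formula).

    Inner products are taken with respect to V^{-1}, and each vector is
    centered on its generalized-least-squares mean under V.
    """
    Vinv = np.linalg.inv(V)
    ones = np.ones(len(x))
    denom = ones @ Vinv @ ones
    xc = x - (ones @ Vinv @ x) / denom  # GLS-centered genotypes at locus 1
    yc = y - (ones @ Vinv @ y) / denom  # GLS-centered genotypes at locus 2
    sxy = xc @ Vinv @ yc
    return float(sxy ** 2 / ((xc @ Vinv @ xc) * (yc @ Vinv @ yc)))


if __name__ == "__main__":
    # Toy unphased genotypes (allele counts 0/1/2) for six individuals.
    x = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 2.0])
    y = np.array([0.0, 1.0, 2.0, 2.0, 0.0, 1.0])

    # Hypothetical kinship-like matrix: two families of three relatives each.
    block = np.full((3, 3), 0.5) + 0.5 * np.eye(3)
    V = np.block([[block, np.zeros((3, 3))], [np.zeros((3, 3)), block]])

    print("usual r^2:   ", r2(x, y))
    print("weighted r^2:", r2_weighted(x, y, V))
```

With `V` set to the identity matrix (unrelated individuals), the weighted version reduces to the usual r^2, which is the behaviour one would expect of any correction of this kind.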