Lynch Michael, Xu Sen, Maruki Takahiro, Jiang Xiaoqian, Pfaffelhuber Peter, Haubold Bernhard
Department of Biology, Indiana University, Bloomington, Indiana 47401.
Faculty of Mathematics and Physics, University of Freiburg, Freiburg 79104, Germany.
Genetics. 2014 Sep;198(1):269-81. doi: 10.1534/genetics.114.166843. Epub 2014 Jun 19.
Although the analysis of linkage disequilibrium (LD) plays a central role in many areas of population genetics, the sampling variance of LD is known to be very large and highly sensitive to the numbers of nucleotide sites and individuals sampled. Here we show that a genome-wide analysis of the distribution of heterozygous sites within a single diploid genome can yield highly informative patterns of LD as a function of physical distance. The proposed statistic, the correlation of zygosity, is closely related to the conventional population-level measure of LD, but is agnostic with respect to allele frequencies and hence likely less prone to outlier artifacts. Application of the method to several vertebrate species leads to the conclusion that >80% of recombination events are typically resolved by gene-conversion-like processes unaccompanied by crossovers, with the average length of conversion patches being on the order of one to several kilobases. Thus, contrary to common assumptions, the recombination rate between sites does not scale linearly with distance, often even up to distances of 100 kb. In addition, the amount of LD between sites separated by <200 bp is uniformly much greater than can be explained by the conventional neutral model, possibly because of the nonindependent origin of mutations within this spatial scale. These results raise questions about the application of conventional population-genetic interpretations to LD on short spatial scales and also about the use of spatial patterns of LD to infer demographic histories.
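To make the idea of a correlation of zygosity concrete, the following is a minimal sketch and not the authors' estimator. It assumes the statistic can be illustrated as the Pearson correlation between heterozygosity indicators at pairs of sites separated by d base pairs within one diploid genome, and it ignores sequencing error and missing data, which a real analysis would need to handle. The function name `zygosity_correlation` and the 0/1 input encoding are hypothetical choices made for this illustration.

```python
def zygosity_correlation(het, d):
    """Correlation of heterozygosity between sites d positions apart.

    het: sequence of 0/1 values, one per site (1 = heterozygous).
    d:   physical distance between paired sites, in sites.

    Assumption: this is a plain Pearson-style correlation of the
    indicators, used only to illustrate the concept.
    """
    n_pairs = len(het) - d
    if d < 1 or n_pairs < 1:
        raise ValueError("d must be between 1 and len(het) - 1")

    # Genome-wide heterozygosity (fraction of heterozygous sites).
    h = sum(het) / len(het)
    if h == 0.0 or h == 1.0:
        raise ValueError("correlation is undefined when all sites are alike")

    # Fraction of site pairs at distance d that are both heterozygous.
    both = sum(het[i] * het[i + d] for i in range(n_pairs)) / n_pairs

    # Excess of double heterozygotes over the independence expectation,
    # scaled by the variance of a single-site indicator.
    return (both - h * h) / (h * (1.0 - h))


# Toy input: heterozygous sites alternate with homozygous sites.
toy = [1, 0] * 4
print(zygosity_correlation(toy, 2))  # 1.0
print(zygosity_correlation(toy, 1))  # -1.0
```

Evaluating such a function over a range of d values would give a profile of the statistic against physical distance, which is the kind of pattern the abstract describes.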