Maniatis N, Collins A, Xu CF, McCarthy LC, Hewett DR, Tapper W, Ennis S, Ke X, Morton NE
Human Genetics Division, University of Southampton, Southampton General Hospital, Southampton SO16 6YD, United Kingdom.
Proc Natl Acad Sci U S A. 2002 Feb 19;99(4):2228-33. doi: 10.1073/pnas.042680999. Epub 2002 Feb 12.
Linkage disequilibrium (LD) provides information about positional cloning, linkage, and evolution that cannot be inferred from other evidence, even when a correct sequence and a linkage map based on more than a handful of families become available. We present theory to construct an LD map for which distances are additive and population-specific maps are expected to be approximately proportional. For this purpose, there is only a modest difference in relative efficiency of haplotypes and diplotypes: resolving the latter into 2-locus haplotypes has significant cost or error and increases information by about 50%. LD maps for a cold spot in 19p13.3 and a more typical region in 3q21 are optimized by interval estimates. For a random sample and a trustworthy map, the value of LD at large distance can be predicted reliably from information over a small distance and does not depend on the evolutionary variance unless the sample size approaches the population size. Values of the association probability that can be distinguished from the value at large distance are determined not by population size but by time since a critical bottleneck. In these examples, omission of markers with significant Hardy-Weinberg disequilibrium does not improve the map, and widely discrepant draft sequences have similar estimates of the genetic parameters. The LD cold spot in 19p13.3 gives an unusually high estimate of time, supporting an argument that this relationship is general. As predicted for a region with ancient haplotypes or uniformly high recombination, there is no clear evidence of LD clustering. In contrast, the 3q21 region is resolved into alternating blocks of stable and decreasing LD, as expected from crossover clustering. Construction of a genome-wide LD map requires data not yet available, which may be complemented but not replaced by a catalog of haplotypes.