Schulze Thomas G, Chen Yu-Sheng, Akula Nirmala, Hennessy Kathleen, Badner Judith A, McInnis Melvin G, DePaulo J Raymond, Schumacher Johannes, Cichon Sven, Propping Peter, Maier Wolfgang, Rietschel Marcella, Nöthen Markus M, McMahon Francis J
Department of Psychiatry, The University of Chicago, Chicago, IL 60637, USA.
Hum Mol Genet. 2002 Jun 1;11(12):1363-72. doi: 10.1093/hmg/11.12.1363.
The distribution of linkage disequilibrium (LD) across the genome is highly complex. Little is known about the relationship between long-range and short-range LD in a genomic region. We assessed whether a dense set of microsatellite data could be used to predict short-range LD in family samples. We analyzed intermarker LD in data derived from chromosomal regions 18q22 and 10q25-26, densely genotyped with microsatellite markers. The pattern of LD was highly heterogeneous within and between both chromosomal regions. On 10q25-26, very little LD was detected. On 18q22, where marker density was higher, many marker pairs were in LD. We modeled the decay of LD over distance in this region. A classical model accounted for most of the relationship between LD and distance (R² = 63%). We used this model to predict the proportion of markers expected to show useful levels of LD at short distances. This prediction agreed with estimates based on single-nucleotide polymorphism (SNP) marker genotypes in the region. Both microsatellite and SNP data predict that about 80% of marker pairs would display levels of LD that are useful for association studies at distances of up to 15 kb in this region. These projections also agree with levels of LD directly measured in a 10 kb set of SNP genotypes generated in a nearby region of finished sequence. Our results suggest that existing sets of microsatellite data, if sufficiently dense, may be used to develop good initial estimates of the density of additional markers needed to screen a region for disease alleles by association analysis.
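The abstract does not state which classical model was fitted, so the following is only an illustrative sketch. It assumes an exponential decay of pairwise LD with physical distance, LD(d) = exp(intercept + slope * d), fitted by ordinary least squares on the log scale. The function names (fit_ld_decay, predict_ld) are hypothetical, and the inputs would have to come from a real set of marker-pair measurements; no values from the study are reproduced here.

```python
import math


def fit_ld_decay(distances_kb, ld_values):
    """Fit ln(LD) = intercept + slope * distance by ordinary least squares.

    Assumption: LD decays exponentially with distance. All LD values must
    be strictly positive so that the logarithm is defined.
    Returns (intercept, slope, r_squared), with R^2 computed on the log scale.
    """
    logs = [math.log(v) for v in ld_values]
    n = len(distances_kb)
    mean_d = sum(distances_kb) / n
    mean_l = sum(logs) / n

    # Standard closed-form least-squares estimates for a single predictor.
    sxx = sum((d - mean_d) ** 2 for d in distances_kb)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances_kb, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_d

    # Proportion of variance in ln(LD) explained by distance.
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    ss_res = sum(
        (l - (intercept + slope * d)) ** 2 for d, l in zip(distances_kb, logs)
    )
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


def predict_ld(distance_kb, intercept, slope):
    """Extrapolate the fitted curve to a given distance, e.g. a short range."""
    return math.exp(intercept + slope * distance_kb)
```

Under these assumptions, a curve fitted to long-range microsatellite pairs could be evaluated with predict_ld at short distances such as 15 kb and compared against LD measured directly from SNP genotypes, which mirrors the comparison the abstract describes.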