Maniatis Nikolas, Collins Andrew, Gibson Jane, Zhang Weihua, Tapper William, Morton Newton E
Human Genetics Division, University of Southampton, Southampton SO16 6YD, United Kingdom.
Am J Hum Genet. 2004 May;74(5):846-55. doi: 10.1086/383589. Epub 2004 Mar 26.
Recently, metric linkage disequilibrium (LD) maps that assign an LD unit (LDU) location for each marker have been developed (Maniatis et al. 2002). Here we present a multiple pairwise method for positional cloning by LD within a composite likelihood framework and investigate the operating characteristics of maps in physical units (kb) and LDU for two bodies of data (Daly et al. 2001; Jeffreys et al. 2001) on which current ideas of blocks are based. False-negative indications of a disease locus (type II error) were examined by selecting one single-nucleotide polymorphism (SNP) at a time as causal and taking its allelic count (0, 1, or 2, for the three genotypes) as a pseudophenotype, Y. Association between every pseudophenotype and the allelic count of each SNP locus (X) was estimated by regression and correlation and fitted to an adaptation of the Malecot model, which includes a parameter for the location of the putative gene. With locations expressed in either kb or LDU, greater power for localization was observed when the LDU map was fitted. The efficiency of the kb map, relative to the LDU map, to describe LD varied from a maximum of 0.87 to a minimum of 0.36, with a mean of 0.62. False-positive indications of a disease locus (type I error) were examined by simulating an unlinked causal SNP and using its allele count as a pseudophenotype. The type I error was in good agreement with Wald's likelihood theorem for both metrics and all models that were tested. Unlike tests that select only the most significant marker, haplotype, or haploset, these methods are robust to large numbers of markers in a candidate region. Contrary to predictions from tagging SNPs that retain haplotype diversity, the sample with smaller size but greater SNP density gave less error. The locations of causal SNPs were estimated with the same precision in blocks and steps, suggesting that block definition may be less useful than anticipated for mapping a causal SNP. These results provide a guide to efficient positional cloning by SNPs and a benchmark against which the power of positional cloning by haplotype-based alternatives may be measured.
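The pseudophenotype procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly cited Malecot form, association = (1 − L)·M·exp(−ε·d) + L, with d the distance between a marker and the putative causal location S (in kb or LDU). It also swaps in an unweighted least-squares grid search for the composite likelihood used in the paper. All function names, grids, and marker positions are hypothetical.

```python
import math

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists of allele counts."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: no association can be measured
    return sxy / math.sqrt(sxx * syy)

def pseudophenotype_associations(genotypes, causal_index):
    """Take one SNP's allele count (0, 1, 2) as Y and correlate it with every SNP X.

    genotypes: list of individuals, each a list of allele counts per SNP.
    """
    y = [person[causal_index] for person in genotypes]
    n_snps = len(genotypes[0])
    return [correlation([person[j] for person in genotypes], y)
            for j in range(n_snps)]

def malecot(distance, M, eps, L):
    """Assumed Malecot form: association decays exponentially with distance."""
    return (1 - L) * M * math.exp(-eps * distance) + L

def fit_location(positions, assoc, loc_grid, eps_grid, M=1.0, L=0.0):
    """Grid search for the location S and decay rate eps that minimise the
    unweighted sum of squared residuals (a stand-in for composite likelihood)."""
    best = None
    for S in loc_grid:
        for eps in eps_grid:
            sse = sum((a - malecot(abs(p - S), M, eps, L)) ** 2
                      for p, a in zip(positions, assoc))
            if best is None or sse < best[0]:
                best = (sse, S, eps)
    return best[1], best[2]

# Noise-free illustration: associations generated from the model itself,
# with a hypothetical causal location between two markers.
positions = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # kb or LDU
assoc = [malecot(abs(p - 35), 1.0, 0.05, 0.0) for p in positions]
S_hat, eps_hat = fit_location(positions, assoc,
                              range(0, 101, 5), [0.01, 0.02, 0.05, 0.1])
```

Running the same fit once with marker positions in kb and once in LDU, and comparing the residual error of the two fits, would mirror the kb-versus-LDU comparison reported in the abstract. With real genotypes, the `assoc` values would come from `pseudophenotype_associations` rather than from the model.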