Rannala B, Reeve JP
Department of Medical Genetics, University of Alberta, Edmonton, Alberta T6G 2H7, Canada.
Am J Hum Genet. 2001 Jul;69(1):159-78. doi: 10.1086/321279. Epub 2001 Jun 15.
A new method is presented for fine-scale linkage disequilibrium (LD) mapping of a disease mutation. It uses multiple linked single-nucleotide polymorphisms, restriction-fragment-length polymorphisms, or microsatellite markers and incorporates information from an annotated human genome sequence (HGS) and from a human mutation database. The method accounts for population demographic effects, using Markov chain Monte Carlo methods to integrate over the unknown gene genealogy and gene coalescence times. Information from mutation databases about the relative frequency of disease mutations in exons, introns, and other regions, together with assumptions about the completeness of the gene annotation, is combined with an annotated HGS to generate a prior probability that a mutation lies at any particular position in a specified region of the genome. This prior is then updated with information about mutation location from LD at a set of linked markers in the region, yielding the posterior probability density of the mutation location. The performance of the method is evaluated by simulation and by analysis of a data set for diastrophic dysplasia (DTD) in Finland. The DTD disease gene has been positionally cloned, so the actual location of the mutation is known and can be compared with the position predicted by the method. For the DTD data, adding information from an HGS localizes the disease gene at a much higher resolution than would be possible by LD mapping alone. In this case, the gene would be found by sequencing a region ≤7 kb in size.
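The prior-to-posterior update described in the abstract can be illustrated with a minimal sketch on a discrete grid of positions. Everything specific here is an assumption made for illustration: the ten-position region, the annotation labels, the class weights in `weights`, and the bell-shaped `likelihood`, which is only a stand-in for the LD likelihood that the paper obtains by Markov chain Monte Carlo integration over genealogies. The function names `annotation_prior` and `posterior` are hypothetical and do not come from the authors' software.

```python
from math import exp

def annotation_prior(classes, weights):
    """Prior probability per position, proportional to the weight of its annotation class."""
    raw = [weights[c] for c in classes]
    total = sum(raw)
    return [r / total for r in raw]

def posterior(prior, likelihood):
    """Bayes update on a discrete grid: posterior is proportional to prior times likelihood."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical annotation of a 10-position region (illustrative only).
classes = ["other"] * 3 + ["intron"] * 2 + ["exon"] * 2 + ["intron"] + ["other"] * 2

# Assumed relative frequencies of disease mutations per class (not values from the paper).
weights = {"exon": 10.0, "intron": 1.0, "other": 0.1}

# Stand-in for the LD likelihood: a bell curve peaked at position 4 (an intron here).
likelihood = [exp(-0.5 * ((i - 4) / 2.0) ** 2) for i in range(len(classes))]

prior = annotation_prior(classes, weights)
post = posterior(prior, likelihood)

ld_only_best = max(range(len(likelihood)), key=likelihood.__getitem__)
best = max(range(len(post)), key=post.__getitem__)
print("LD-only peak:", ld_only_best, "posterior peak:", best)
```

In this toy setup the likelihood alone peaks in an intron, while the posterior peak shifts to the neighbouring exon because the assumed prior favours exons. This mirrors, in a very simplified way, how annotation information can sharpen localization beyond LD alone.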