Spencer David H, Bubb Kerry L, Olson Maynard V
Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
Am J Hum Genet. 2006 Nov;79(5):958-64. doi: 10.1086/508757. Epub 2006 Sep 25.
Comparisons between haplotypes from affected patients and the human reference genome are frequently used to identify candidates for disease-causing mutations, even though these alignments are expected to reveal a high level of background neutral polymorphism. This limits the scope of genetic studies to relatively small genomic intervals, because current methods for distinguishing potential causal mutations from neutral variation are inefficient. Here we describe a new strategy for detecting mutations that is based on comparing affected haplotypes with closely matched control sequences from healthy individuals, rather than with the human reference genome. We use theory, simulation, and a real data set to show that this approach is expected to reduce the number of sequence variants that must be subjected to follow-up analysis by at least a factor of 20 when closely matched control sequences are selected from a reference panel with as few as 100 control genomes. We also define a reference data resource that would allow efficient application of this strategy to large critical intervals across the genome.
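The matched-control comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each haplotype is represented as the set of positions where it carries a non-reference allele (biallelic sites only), and it takes the "closely matched" control to be the panel haplotype with the fewest differences from the affected haplotype. The function names, the positions, and the three-haplotype panel are hypothetical.

```python
from typing import FrozenSet, Iterable, Tuple

# Assumption: a haplotype is the set of positions carrying a non-reference
# allele, so the reference genome itself is the empty set.
Haplotype = FrozenSet[int]


def closest_control(affected: Haplotype, panel: Iterable[Haplotype]) -> Haplotype:
    """Return the panel haplotype with the fewest differing sites."""
    controls = list(panel)
    if not controls:
        raise ValueError("control panel is empty")
    # The symmetric difference holds the sites where the two haplotypes disagree.
    return min(controls, key=lambda control: len(affected ^ control))


def candidate_variants(
    affected: Haplotype, panel: Iterable[Haplotype]
) -> Tuple[Haplotype, Haplotype]:
    """Return the best-matched control and the sites left for follow-up."""
    control = closest_control(affected, panel)
    return control, affected ^ control


# Hypothetical toy data: positions are arbitrary integers.
affected = frozenset({101, 250, 399, 512, 777})
panel = [
    frozenset({101, 250, 399, 777}),  # shares most background variants
    frozenset({250, 640}),
    frozenset(),                      # identical to the reference
]

reference = frozenset()
vs_reference = affected ^ reference            # sites differing from the reference
control, candidates = candidate_variants(affected, panel)

print(len(vs_reference), "sites differ from the reference")
print(len(candidates), "sites differ from the matched control:", sorted(candidates))
```

In this toy case the shared background variants cancel against the matched control, leaving fewer sites for follow-up than the comparison with the reference. The size of the reduction in real data depends on the panel and on how closely the selected control matches the affected haplotype.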