Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands.
Hum Mutat. 2009 Dec;30(12):1703-12. doi: 10.1002/humu.21122.
We evaluated massively parallel sequencing combined with long-range PCR (LRP) for rare variant detection and allele frequency estimation in pooled DNA samples. Exons 2 to 16 of the MUTYH gene were analyzed in breast cancer patients with Illumina's (Solexa) technology. For the first pool, we generated a single LRP product from 287 pooled genomic DNA samples; for the second pool, we performed the same LRP on 88 individual samples and then pooled the resulting products. Concentrations of the constituent samples were measured by fluorimetry for genomic DNA and by high-resolution melting curve analysis (HR-MCA) for LRP products. Illumina sequencing results were compared to Sanger sequencing data from the individual samples. Allele frequencies detected by the two methods correlated poorly in the first pool, presumably because variable DNA quality caused the genomic samples to amplify unequally in the LRP. In contrast, allele frequencies correlated well in the second pool, in which all expected alleles at a frequency of 1% or higher were reliably detected, along with the majority of singletons (0.6% allele frequency). We describe custom bioinformatics and statistics to optimize detection of rare variants and to estimate the required sequencing depth. Our results provide directions for designing high-throughput analyses of candidate genes.
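The arithmetic behind the singleton frequency and a depth estimate can be illustrated with a minimal sketch. It is not the statistical method described in the paper. It assumes diploid samples contributing equally to the pool, error-free reads, and a simple binomial model of read sampling. The function names, the minimum-read threshold, and the target detection probability are all hypothetical choices made for illustration.

```python
from math import comb


def singleton_frequency(n_individuals: int, ploidy: int = 2) -> float:
    """Allele frequency of a variant carried once in an equimolar pool."""
    return 1.0 / (n_individuals * ploidy)


def detection_probability(depth: int, allele_freq: float, min_reads: int) -> float:
    """P(at least min_reads variant reads) under Binomial(depth, allele_freq).

    Assumption: reads sample alleles independently and without sequencing error.
    """
    p_fewer = sum(
        comb(depth, k) * allele_freq**k * (1.0 - allele_freq) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


def required_depth(allele_freq: float, min_reads: int,
                   target: float = 0.95, max_depth: int = 100_000) -> int:
    """Smallest depth at which the detection probability reaches the target."""
    for depth in range(max(min_reads, 1), max_depth + 1):
        if detection_probability(depth, allele_freq, min_reads) >= target:
            return depth
    raise ValueError("target not reached within max_depth")


if __name__ == "__main__":
    freq = singleton_frequency(88)  # 1 allele in 176, roughly 0.6%
    print(f"singleton frequency: {freq:.4%}")
    # Hypothetical rule: call a variant when it is seen in at least 5 reads.
    print("depth needed:", required_depth(freq, min_reads=5))
```

For the second pool, 88 diploid samples give 176 alleles, so a singleton sits at 1/176, which matches the 0.6% quoted in the abstract. In practice sequencing error and unequal sample representation raise the depth required beyond what this idealized model suggests.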