Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland.
PLoS One. 2009 Aug 17;4(8):e6659. doi: 10.1371/journal.pone.0006659.
Detection of rare polymorphisms and causative mutations of genetic diseases in a targeted genomic area has become a major goal for understanding genomic and phenotypic variability. We interrogated 8.9 Mb of repeat-masked regions on human chromosomes 21 (7.8 Mb) and 7 (1.1 Mb) from an individual from the International HapMap Project (NA12872). We optimized a method of genomic selection for high-throughput sequencing. Microarray-based selection and sequencing resulted in 260-fold enrichment, with 41% of reads mapping to the target region. Of the SNPs in the targeted region, 83% had at least 4-fold sequence coverage and 54% had at least 15-fold. When assaying HapMap SNPs in NA12872, our sequence genotypes were 91.3% concordant in regions with coverage ≥ 4-fold and 97.9% concordant in regions with coverage ≥ 15-fold. About 81% of the SNPs recovered with both thresholds are listed in dbSNP. We observed that regions with low sequence coverage occur in close proximity to low-complexity DNA. Validation experiments using Sanger sequencing were performed for 46 SNPs with 15- to 20-fold coverage, with a confirmation rate of 96%, suggesting that DNA selection provides an accurate and cost-effective method for identifying rare genomic variants.
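The concordance figures above depend on a coverage threshold, and the calculation can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Site` record, the `normalize` helper, and the `concordance` function are hypothetical names introduced here, and the sketch assumes genotypes are given as unphased two-letter strings such as "AG".

```python
from typing import Iterable, NamedTuple, Optional


class Site(NamedTuple):
    """One SNP position (hypothetical record layout, for illustration only)."""
    coverage: int       # number of reads covering the position
    seq_genotype: str   # genotype called from sequencing, e.g. "AG"
    ref_genotype: str   # reference genotype, e.g. from HapMap


def normalize(genotype: str) -> str:
    # Treat genotypes as unphased: "AG" and "GA" compare equal.
    return "".join(sorted(genotype.upper()))


def concordance(sites: Iterable[Site], min_coverage: int) -> Optional[float]:
    """Fraction of sites with coverage >= min_coverage whose sequence
    genotype matches the reference genotype; None if no site qualifies."""
    eligible = [s for s in sites if s.coverage >= min_coverage]
    if not eligible:
        return None
    matches = sum(
        normalize(s.seq_genotype) == normalize(s.ref_genotype)
        for s in eligible
    )
    return matches / len(eligible)
```

Calling `concordance(sites, 4)` and `concordance(sites, 15)` on the same list of sites would give the two kinds of values reported above. Raising the threshold discards low-coverage calls, which tend to be less reliable.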