John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America.
PLoS One. 2009 Dec 14;4(12):e8232. doi: 10.1371/journal.pone.0008232.
Over the next few years, the efficient use of next-generation sequencing (NGS) in human genetics research will depend heavily on effective mechanisms for the selective enrichment of genomic regions of interest. Recently, comprehensive exome capture arrays have become available for targeting approximately 33 Mb, or approximately 180,000 coding exons, across the human genome. Selective genomic enrichment of the human exome offers an attractive option for new experimental designs that aim to quickly identify potential disease-associated genetic variants, especially in family-based studies. We evaluated a 2.1M-feature human exome capture array on eight individuals from a three-generation family pedigree. Using long-read sequencing, we covered up to 98% of the targeted bases at a read depth of ≥3 and 86% at a read depth of ≥10, and over 50% of all targets were covered by ≥20 reads. We identified up to 14,284 SNPs and small indels per individual exome, with up to 1,679 of these representing putative novel polymorphisms. With the conservative genotype-calling approach HCDiff, the average detection rate for variant alleles, measured against Illumina 1M BeadChip genotypes, was 95.2% at ≥10x sequence coverage. Further, we propose an advantageous genotype-calling strategy for low-coverage targets that empirically determines cut-off thresholds at a given coverage depth based on existing genotype data. This method detected >99% of SNPs covered at ≥8x. Our results offer guidance for "real-world" applications in human genetics and provide further evidence that microarray-based exome capture is an efficient and reliable method to enrich for chromosomal regions of interest in next-generation sequencing experiments.
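The abstract does not say how the depth-specific cut-off thresholds are derived, so the following is only a minimal sketch of one plausible reading. It assumes that each cut-off is a minimum number of variant-supporting reads, and that for every coverage depth the cut-off is chosen to maximize agreement with existing array genotypes. The function names `calibrate_min_variant_reads` and `call_variant` and the input tuple layout are hypothetical; this is not HCDiff and not the authors' implementation.

```python
from collections import defaultdict


def calibrate_min_variant_reads(observations):
    """Pick, per coverage depth, the minimum variant-read count that best
    agrees with known genotypes (assumed criterion, not from the paper).

    observations: iterable of (depth, variant_reads, array_has_variant)
    tuples, where array_has_variant is True if the array genotype
    carries at least one variant allele at that site.
    """
    by_depth = defaultdict(list)
    for depth, variant_reads, array_has_variant in observations:
        by_depth[depth].append((variant_reads, array_has_variant))

    cutoffs = {}
    for depth, rows in by_depth.items():
        best_k, best_agree = 1, -1
        # Try every possible cut-off k; on ties keep the smallest k.
        for k in range(1, depth + 1):
            agree = sum((reads >= k) == truth for reads, truth in rows)
            if agree > best_agree:
                best_k, best_agree = k, agree
        cutoffs[depth] = best_k
    return cutoffs


def call_variant(depth, variant_reads, cutoffs):
    """Return True/False for a variant call, or None if no cut-off
    was calibrated for this depth."""
    k = cutoffs.get(depth)
    if k is None:
        return None
    return variant_reads >= k
```

In this sketch the sites with array genotypes serve as the calibration set, and the resulting per-depth cut-offs are then applied to sites that have sequence data only. Returning `None` for uncalibrated depths keeps the caller from silently applying a threshold that was never checked against genotype data.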