Craig David W, Pearson John V, Szelinger Szabolcs, Sekar Aswin, Redman Margot, Corneveaux Jason J, Pawlowski Traci L, Laub Trisha, Nunn Gary, Stephan Dietrich A, Homer Nils, Huentelman Matthew J
The Translational Genomics Research Institute, Phoenix, Arizona 85004, USA.
Nat Methods. 2008 Oct;5(10):887-93. doi: 10.1038/nmeth.1251. Epub 2008 Sep 14.
We developed a generalized framework for multiplexed resequencing of targeted human genome regions on the Illumina Genome Analyzer using degenerate indexed DNA bar codes ligated to fragmented DNA before sequencing. Using this method, we simultaneously sequenced the DNA of multiple HapMap individuals at several Encyclopedia of DNA Elements (ENCODE) regions. We then evaluated the use of Bayes factors for discovering and genotyping polymorphisms. For polymorphisms that were either previously identified within the Single Nucleotide Polymorphism database (dbSNP) or visually evident upon re-inspection of archived ENCODE traces, we observed a false positive rate of 11.3% using strict thresholds for predicting variants and 69.6% for lax thresholds. Conversely, false negative rates were 10.8-90.8%, with false negatives at stricter cut-offs occurring at lower coverage (<10 aligned reads). These results suggest that >90% of genetic variants are discoverable using multiplexed sequencing provided sufficient coverage at the polymorphic base.
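To illustrate how a Bayes factor can weigh evidence for a variant at a single base, and why low coverage limits detection, here is a minimal sketch. It is not the model used in the paper: the function name `log10_bayes_factor`, the fixed per-read error rate, and the simple two-hypothesis comparison (heterozygous variant versus homozygous reference, with binomially distributed read counts) are all assumptions made for illustration.

```python
from math import log10


def log10_bayes_factor(alt_reads: int, total_reads: int, error_rate: float = 0.01) -> float:
    """Return log10 of the Bayes factor for 'heterozygous variant' versus
    'homozygous reference' at one base (hypothetical, simplified model).

    Assumptions (not from the paper):
      - under the reference hypothesis, a read shows the alternate base
        only through sequencing error, with probability `error_rate`;
      - under the heterozygous hypothesis, a read shows the alternate
        base with probability 0.5;
      - reads are independent, so counts are binomial under both models.
    """
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")

    p_het = 0.5
    p_ref = error_rate
    ref_reads = total_reads - alt_reads

    # The binomial coefficient is identical under both hypotheses and
    # cancels in the ratio, so only the probability terms remain.
    # Working in log space avoids underflow at high coverage.
    return (alt_reads * log10(p_het / p_ref)
            + ref_reads * log10((1.0 - p_het) / (1.0 - p_ref)))


# Same 50% alternate-allele fraction at two coverage levels:
low_coverage = log10_bayes_factor(alt_reads=2, total_reads=4)
high_coverage = log10_bayes_factor(alt_reads=10, total_reads=20)
print(low_coverage, high_coverage)
```

Under these assumptions, the same allele fraction yields a larger Bayes factor at higher coverage, so a strict threshold is more likely to miss true variants at sites with few aligned reads, which is consistent with the abstract's observation about false negatives below 10 aligned reads.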