Department of Genetics, Stanford University, Stanford, California, USA.
Nat Biotechnol. 2011 Dec 18;30(1):78-82. doi: 10.1038/nbt.2065.
Whole-genome sequencing is becoming commonplace, but the accuracy and completeness of variant calling by the most widely used platforms from Illumina and Complete Genomics have not been reported. Here we sequenced the genome of an individual with both technologies to a high average coverage of ∼76×, and compared their performance with respect to sequence coverage and calling of single-nucleotide variants (SNVs), insertions and deletions (indels). Although 88.1% of the ∼3.7 million unique SNVs were concordant between platforms, there were tens of thousands of platform-specific calls located in genes and other genomic regions. In contrast, 26.5% of indels were concordant between platforms. Target enrichment validated 92.7% of the concordant SNVs, whereas validation by genotyping array revealed a sensitivity of 99.3%. The validation experiments also suggested that >60% of the platform-specific variants were indeed present in the genome. Our results have important implications for understanding the accuracy and completeness of the genome sequencing platforms.
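The concordance figures above can be read as set arithmetic over two call sets. The following is a minimal sketch of that arithmetic, not the authors' pipeline. It assumes that "concordant" means the same chromosome, position, reference allele and alternate allele were called by both platforms, and that the percentage is taken over the union of unique calls. The `Variant` type, the `concordance` function and the toy calls are hypothetical and are not data from the study.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical variant key: two calls match only if all fields agree."""
    chrom: str
    pos: int
    ref: str
    alt: str


def concordance(calls_a: Set[Variant], calls_b: Set[Variant]) -> Dict[str, float]:
    """Summarize agreement between two call sets over their union."""
    shared = calls_a & calls_b   # called by both platforms
    union = calls_a | calls_b    # all unique calls across platforms
    total = len(union)
    return {
        "union": total,
        "concordant": len(shared),
        "only_a": len(calls_a - calls_b),  # platform-specific to A
        "only_b": len(calls_b - calls_a),  # platform-specific to B
        "concordance_pct": 100.0 * len(shared) / total if total else 0.0,
    }


# Toy call sets (illustrative only, not from the paper).
platform_a = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2000, "C", "T"),
    Variant("chr2", 1500, "G", "A"),
    Variant("chr3", 500, "T", "C"),
}
platform_b = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2000, "C", "T"),
    Variant("chr2", 1500, "G", "A"),
    Variant("chr4", 750, "A", "T"),
}

summary = concordance(platform_a, platform_b)
print(summary)
```

Exact matching on all four fields is a simplification: the same indel can be represented at different positions or with different alleles by different callers, so real comparisons typically normalize variant representation first, which this sketch does not attempt.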