Erlich Yaniv, Mitra Partha P, de la Bastide Melissa, McCombie W Richard, Hannon Gregory J
Watson School of Biological Sciences, 1 Bungtown Road, Cold Spring Harbor, New York 11724, USA.
Nat Methods. 2008 Aug;5(8):679-82. doi: 10.1038/nmeth.1230. Epub 2008 Jul 6.
Next-generation sequencing is limited by short read lengths and high error rates. We systematically analyzed sources of noise in the Illumina Genome Analyzer that contribute to these high error rates and developed a base caller, Alta-Cyclic, that uses machine learning to compensate for noise factors. Alta-Cyclic substantially improved the number of accurate reads for sequencing runs up to 78 bases and reduced systematic biases, facilitating confident identification of sequence variants.