Richterich P
Genome Therapeutics Corp., Waltham, Massachusetts 02154, USA.
Genome Res. 1998 Mar;8(3):251-9. doi: 10.1101/gr.8.3.251.
As DNA sequencing is increasingly performed in a mass-production-like manner, efficient quality control measures become more important for process control, as does the ability to compare different methods and projects. One of the fundamental quality measures in sequencing projects is the position-specific error probability at every base in each individual sequence. Accurate prediction of base-specific error rates from "raw" sequence data would allow immediate quality control as well as benchmarking of different methods and projects, while avoiding the inefficiencies and time delays associated with resequencing and with assessments made after "finishing" a sequence. The program PHRED provides base-specific quality scores that are logarithmically related to error probabilities. This study assessed the accuracy of PHRED's error-rate prediction by analyzing sequencing projects from six different large-scale sequencing laboratories. All projects used four-color fluorescent sequencing, but the sequencing methods varied widely between projects. The results indicate that error-rate predictions such as those given by PHRED can be highly accurate for a large variety of sequencing methods as well as over a wide range of sequence quality.
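The logarithmic relationship mentioned above can be illustrated with a minimal sketch. It assumes the standard Phred convention Q = -10 log10(P), which the abstract itself does not spell out. The function names and the `(quality_score, is_error)` input layout are hypothetical, chosen for illustration only, and are not taken from PHRED or from the study. The last function shows one plausible way to compare predicted and observed error rates per quality score; it is not the paper's actual procedure.

```python
import math
from collections import defaultdict


def phred_to_error_prob(q):
    """Convert a quality score Q to an error probability, assuming Q = -10*log10(P)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability P (0 < P <= 1) to a quality score."""
    return -10 * math.log10(p)


def observed_error_rates(calls):
    """Group base calls by quality score and compute the observed error rate per score.

    `calls` is an iterable of (quality_score, is_error) pairs (hypothetical layout),
    where is_error is True if the base call disagrees with a trusted reference.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for q, is_error in calls:
        totals[q] += 1
        errors[q] += int(is_error)
    return {q: errors[q] / totals[q] for q in totals}
```

Under this convention, a score of 20 corresponds to a predicted error probability of 1 in 100 and a score of 30 to 1 in 1000. Comparing `phred_to_error_prob(q)` with the matching entry from `observed_error_rates` gives a rough check of how well the predictions are calibrated.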