Nat Biotechnol. 2014 Sep;32(9):903-14. doi: 10.1038/nbt.2957. Epub 2014 Aug 24.
We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.