Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
BMC Genomics. 2010 Jun 17;11:383. doi: 10.1186/1471-2164-11-383.
Tiling arrays have been the tool of choice for probing an organism's transcriptome without prior assumptions about the transcribed regions, but RNA-Seq is becoming a viable alternative as the costs of sequencing continue to decrease. Understanding the relative merits of these technologies will help researchers select the appropriate technology for their needs.
Here, we compare these two platforms using a matched sample of poly(A)-enriched RNA isolated from the second larval stage of C. elegans. We find that the raw signals from the two technologies are reasonably well correlated but that RNA-Seq outperforms tiling arrays in several respects, notably in exon boundary detection and dynamic range of expression. By exploring the accuracy of sequencing as a function of depth of coverage, we find that about 4 million reads are required to match the sensitivity of two tiling array replicates. We analyze the effects of cross-hybridization using a "nearest neighbor" classifier applied to array probes, and we describe a method for identifying potential "black list" regions whose signals are unreliable. Finally, we propose a strategy for using RNA-Seq data as a gold-standard set to calibrate tiling array data. All tiling array and RNA-Seq data sets have been submitted to the modENCODE Data Coordinating Center.
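The depth-of-coverage analysis mentioned above can be illustrated with a small subsampling sketch. This is not the procedure used in the study, only a minimal example of the general idea: draw progressively larger subsets of the mapped reads and record what fraction of a reference set of expressed genes is detected at each depth. The function name `sensitivity_by_depth`, the per-read gene assignments, the `min_reads` detection threshold, and the toy gene names are all assumptions made for illustration.

```python
import random
from collections import Counter


def sensitivity_by_depth(read_assignments, reference_genes, depths,
                         min_reads=1, seed=0):
    """Return {depth: fraction of reference genes detected at that depth}.

    read_assignments: list with one gene ID per mapped read (hypothetical input).
    reference_genes:  set of genes treated as truly expressed.
    depths:           read counts to subsample to.
    min_reads:        reads needed to call a gene detected (assumed threshold).
    """
    if not reference_genes:
        raise ValueError("reference_genes must not be empty")

    rng = random.Random(seed)
    shuffled = list(read_assignments)
    # Shuffle once and take prefixes, so each smaller sample is nested
    # inside every larger one and the curve cannot decrease.
    rng.shuffle(shuffled)

    results = {}
    for depth in sorted(depths):
        if depth > len(shuffled):
            raise ValueError("depth exceeds the number of available reads")
        counts = Counter(shuffled[:depth])
        detected = sum(1 for gene in reference_genes
                       if counts[gene] >= min_reads)
        results[depth] = detected / len(reference_genes)
    return results


# Toy data: three expressed genes at very different abundances,
# plus one reference gene (geneD) that has no reads at all.
reads = ["geneA"] * 50 + ["geneB"] * 10 + ["geneC"] * 2
reference = {"geneA", "geneB", "geneC", "geneD"}
curve = sensitivity_by_depth(reads, reference, depths=[5, 20, 62])
```

In practice, the depth at which such a curve reaches the sensitivity of the array replicates would give an estimate comparable in spirit to the figure of about 4 million reads reported above.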
Tiling arrays effectively detect transcript expression levels at a low cost for many species while RNA-Seq provides greater accuracy in several regards. Researchers will need to carefully select the technology appropriate to the biological investigations they are undertaking. It will also be important to reconsider a comparison such as ours as sequencing technologies continue to evolve.