Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, 78712, TX, USA.
Department of Molecular Biosciences, The University of Texas at Austin, Austin, 78712, TX, USA.
BMC Genomics. 2018 Jul 3;19(1):510. doi: 10.1186/s12864-018-4869-5.
Alignment-free RNA quantification tools have significantly increased the speed of RNA-seq analysis. However, it is unclear whether these state-of-the-art RNA-seq analysis pipelines can quantify small RNAs as accurately as they quantify long RNAs in the context of total RNA quantification.
We comprehensively tested and compared four RNA-seq pipelines for accuracy of gene quantification and fold-change estimation, using a novel total RNA benchmarking dataset in which small non-coding RNAs are highly represented alongside long RNAs. The four pipelines comprised two commonly used alignment-free pipelines and two variants of alignment-based pipelines. All pipelines showed high accuracy in quantifying the expression of long and highly abundant genes. However, the alignment-free pipelines performed systematically worse when quantifying low-abundance and small RNAs.
We have shown that alignment-free and traditional alignment-based quantification methods perform similarly for common gene targets, such as protein-coding genes. However, we have identified a potential pitfall in using alignment-free pipelines to analyze and quantify lowly expressed genes and small RNAs, especially when these small RNAs contain biological variation.