Rapaport Franck, Khanin Raya, Liang Yupu, Pirun Mono, Krek Azra, Zumbo Paul, Mason Christopher E, Socci Nicholas D, Betel Doron
Genome Biol. 2013;14(9):R95. doi: 10.1186/gb-2013-14-9-r95.
Many computational methods have been developed for analyzing differential gene expression in RNA-seq data. We describe a comprehensive evaluation of common methods using the SEQC benchmark dataset and ENCODE data. We consider several key features, including normalization, accuracy of differential expression detection, and differential expression analysis when one condition has no detectable expression. We find significant differences among the methods, but note that array-based methods adapted to RNA-seq data perform comparably to methods designed for RNA-seq. Our results demonstrate that increasing the number of replicate samples improves detection power significantly more than increasing sequencing depth.