Department of Statistics, Iowa State University, Snedecor Hall, Ames, Iowa 50011-1210, USA.
Am J Bot. 2012 Feb;99(2):248-56. doi: 10.3732/ajb.1100340. Epub 2012 Jan 20.
RNA-Seq technologies are rapidly transforming genomic studies, and statistical methods for RNA-Seq data are under continuous development. A timely review and comparison of the most recently proposed statistical methods provides a useful guide for choosing among them for data analysis. Particular interest surrounds the ability to detect differential expression (DE) among genes. Here we compare four recently proposed statistical methods, edgeR, DESeq, baySeq, and a method with a two-stage Poisson model (TSPM), through a variety of simulations based either on different distribution models or on real data. We compare the ability of these methods to detect DE genes in terms of the significance ranking of genes and false discovery rate control. All methods compared are implemented in freely available software. We also discuss the availability and functions of the current versions of these software packages.
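As an illustration of what "false discovery rate control" means in a simulation where the true DE status of each gene is known, here is a minimal Python sketch. It assumes the Benjamini-Hochberg adjustment, which is one common procedure; the abstract does not say which procedure each package uses. The function names `benjamini_hochberg` and `empirical_fdr` and the 0.05 threshold are hypothetical choices for illustration, not part of edgeR, DESeq, baySeq, or TSPM.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original gene order.

    Assumption: BH is used only as an example of an FDR-controlling procedure.
    """
    m = len(pvalues)
    # Gene indices sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def empirical_fdr(adjusted, is_de, threshold=0.05):
    """Fraction of genes called DE at `threshold` that are truly non-DE.

    `is_de` holds the simulated ground truth (True for a DE gene).
    """
    called = [truth for q, truth in zip(adjusted, is_de) if q <= threshold]
    if not called:
        return 0.0
    return sum(1 for truth in called if not truth) / len(called)
```

In a simulation study, a method controls the FDR at the nominal level if the empirical FDR, averaged over simulated data sets, does not exceed the chosen threshold.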