Waardenberg Ashley J, Field Matthew A
Australian Institute for Tropical Health and Medicine, Centre for Tropical Bioinformatics and Molecular Biology, Centre for Molecular Therapeutics, James Cook University, Smithfield, Australia.
John Curtin School of Medical Research, Australian National University, Canberra, Australia.
PeerJ. 2019 Dec 13;7:e8206. doi: 10.7717/peerj.8206. eCollection 2019.
Extensive evaluation of RNA-seq methods has demonstrated that no single algorithm consistently outperforms all others. Removal of unwanted variation (RUV) has also been proposed as a method for stabilizing differential expression (DE) results. Despite this, it remains a challenge to run multiple RNA-seq algorithms and identify the significant differences common to all of them, whilst also integrating RUV into each algorithm and assessing its impact. consensusDE was developed to automate the identification of significant DE by combining the results from multiple algorithms with minimal user input, with the option to integrate RUV automatically. consensusDE requires only a table describing the sample groups, a directory containing BAM files or preprocessed count tables, and an optional transcript database for annotation. It supports merging of technical replicates and paired analyses, and it outputs a compendium of plots to guide the user in subsequent analyses. Herein, we assess the ability of RUV to improve DE stability, both when combined with multiple algorithms and between algorithms, through application to real and simulated data. We find that RUV increased fold-change stability between algorithms and improved the FDR of the intersect in a setting of low replication; however, the effect was algorithm-specific and diminished with increased replication, reinforcing the value of increased replication for recovering true DE genes. We finish by offering some rules and considerations for the application of RUV in a consensus-based setting. consensusDE is implemented in R and freely available as a Bioconductor package under the GPL-3 license, along with a comprehensive vignette describing its functionality: http://bioconductor.org/packages/consensusDE/.
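To make the notion of an "intersect" of significant DE results concrete, the following minimal Python sketch shows one simple way such a consensus could be formed: a gene is retained only if its adjusted p-value falls below a threshold in every algorithm. This is an illustration under stated assumptions, not the consensusDE implementation (which is written in R). The function name `consensus_intersect`, the method labels, the gene names, the threshold and all p-values are hypothetical.

```python
from typing import Dict, List


def consensus_intersect(adj_p: Dict[str, Dict[str, float]],
                        alpha: float = 0.05) -> List[str]:
    """Return genes whose adjusted p-value is below alpha in every algorithm.

    adj_p maps an algorithm label to a {gene: adjusted p-value} table.
    Hypothetical helper for illustration; not part of consensusDE.
    """
    if not adj_p:
        return []
    tables = list(adj_p.values())
    # Only genes reported by every algorithm can belong to the intersect.
    shared = set.intersection(*(set(table) for table in tables))
    # Requiring the largest adjusted p-value to pass is the same as
    # requiring every algorithm to call the gene significant.
    return sorted(g for g in shared
                  if max(table[g] for table in tables) < alpha)


# Hypothetical adjusted p-values from three unnamed DE methods.
results = {
    "method_a": {"geneA": 0.001, "geneB": 0.20, "geneC": 0.03},
    "method_b": {"geneA": 0.004, "geneB": 0.01, "geneC": 0.04},
    "method_c": {"geneA": 0.010, "geneB": 0.02, "geneC": 0.08},
}
print(consensus_intersect(results))  # prints ['geneA']
```

Only `geneA` passes in all three tables; `geneB` and `geneC` each fail in one method and are therefore excluded. Because the intersect is conservative in this way, a consensus tends to trade sensitivity for a lower false discovery rate.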