Division of Computational Biology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, 55905, USA.
Center for Individualized Medicine, Mayo Clinic, Rochester, MN, 55905, USA.
Genome Biol. 2024 Oct 30;25(1):282. doi: 10.1186/s13059-024-03230-w.
A recent study found severely inflated type I error rates for DESeq2 and edgeR, two dominant tools for differential expression analysis of RNA-seq data. Here, we show that by properly addressing outliers in RNA-seq data using winsorization, the type I error rates of DESeq2 and edgeR can be substantially reduced, and their power is comparable to that of the Wilcoxon rank-sum test for large datasets. Therefore, as an alternative to the Wilcoxon rank-sum test, they may still be applied to differential expression analysis of large RNA-seq datasets.
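To make the winsorization step concrete, here is a minimal sketch of upper-tail winsorization applied gene by gene to a count matrix. The function name `winsorize_counts`, the genes-by-samples layout, and the default quantile of 0.95 are assumptions chosen for illustration; the abstract does not state the threshold or the exact procedure used.

```python
import numpy as np


def winsorize_counts(counts, upper_quantile=0.95):
    """Cap each gene's counts at that gene's upper quantile across samples.

    counts: array-like of shape (genes, samples) holding raw read counts.
    upper_quantile: hypothetical threshold, not taken from the abstract.
    """
    counts = np.asarray(counts, dtype=float)
    # Per-gene cap: the chosen quantile computed across samples (axis=1).
    caps = np.quantile(counts, upper_quantile, axis=1, keepdims=True)
    # Values above the cap are replaced by the cap; all others are unchanged.
    capped = np.minimum(counts, caps)
    # Count-based tools expect integers, so round to the nearest integer.
    return np.rint(capped).astype(np.int64)


if __name__ == "__main__":
    # Toy matrix: the first gene has one extreme outlier, the second has none.
    toy = [[1, 2, 3, 4, 100],
           [5, 5, 5, 5, 5]]
    print(winsorize_counts(toy, upper_quantile=0.75))
```

The winsorized matrix would then be passed to DESeq2 or edgeR in place of the raw counts. Only the upper tail is capped in this sketch, on the assumption that the problematic outliers are unusually large counts.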