Gauthier Marine, Agniel Denis, Thiébaut Rodolphe, Hejblum Boris P
INRIA SISTM, INSERM Bordeaux Population Health Research Center, University of Bordeaux, F-33000 Bordeaux, France.
RAND Corporation, Santa Monica, CA 90401, USA.
NAR Genom Bioinform. 2020 Nov 19;2(4):lqaa093. doi: 10.1093/nargab/lqaa093. eCollection 2020 Dec.
RNA-seq studies are growing in size and popularity. We provide evidence that the most commonly used methods for differential expression analysis (DEA) may yield too many false positive results in some situations. We present dearseq, a new method for DEA that controls the false discovery rate (FDR) without making any assumption about the true distribution of RNA-seq data. We show that dearseq controls the FDR while maintaining strong statistical power compared to the most popular methods. We demonstrate this behavior with mathematical proofs, simulations and a real data set from a study of tuberculosis, where our method produces fewer apparent false positives.