Department of Statistics, School of Economics, Hangzhou Dianzi University, Hangzhou, China.
PLoS One. 2024 Mar 27;19(3):e0299358. doi: 10.1371/journal.pone.0299358. eCollection 2024.
Single-cell RNA sequencing (scRNA-seq) is a high-throughput experimental technique for studying gene expression at the single-cell level. As a key component of single-cell data analysis, differential expression analysis (DEA) serves as the foundation for all subsequent downstream studies. Although biological replicates are vitally important in DEA, sequencing experiments with few biological replicates remain common, which may pose problems for current DEA methods. A thorough comparison of DEA approaches under small numbers of biological replicates is therefore needed. Here, we compare 6 performance metrics on both simulated and real scRNA-seq datasets to assess the adaptability of 8 DEA approaches, with particular emphasis on how well they perform when biological replicates are few. Our findings suggest that DEA algorithms extended from bulk RNA-seq remain competitive under small-replicate conditions, whereas the newly developed method DEF-scRNA-seq, which is based on information entropy, offers significant advantages. Our research not only provides practical guidance for selecting DEA methods under different conditions, but also highlights the application value of machine learning algorithms in this field.