Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh.
Department of Statistics, Faculty of Science, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj 8100, Bangladesh; Zhejiang University-University of Edinburgh Institute, Zhejiang University School of Medicine, Haining 314400, China.
Genomics. 2024 May;116(3):110834. doi: 10.1016/j.ygeno.2024.110834. Epub 2024 Mar 26.
edgeR (Robust) is a popular approach for identifying differentially expressed genes (DEGs) from RNA-Seq profiles. However, it performs poorly in the presence of gene-specific outliers and cannot handle missing observations. To address these issues, we proposed a pre-processing approach for RNA-Seq count data that combines iLOO-based outlier detection with random forest-based missing-value imputation to boost the performance of edgeR (Robust). Results from both simulated and real RNA-Seq count data showed that the proposed edgeR (Robust) outperformed the conventional edgeR (Robust). To investigate the effectiveness of the identified DEGs for the diagnosis and therapy of ovarian cancer (OC), we selected the 12 top-ranked DEGs (IL6, XCL1, CXCL8, C1QC, C1QB, SNAI2, TYROBP, COL1A2, SNAP25, NTS, CXCL2, and AGT) and suggested 10 top-ranked candidate drug molecules, guided by the hub-DEGs, for the treatment of OC. Hence, our proposed procedure might be an effective computational tool for exploring potential DEGs from RNA-Seq profiles for the diagnosis and therapy of any disease.
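As a rough illustration of the outlier-masking idea behind the pre-processing step, the sketch below runs a single leave-one-out pass over one gene's counts. It is not the authors' implementation and not the full iterative iLOO procedure. The function names (`log_count_prob`, `loo_outlier_flags`, `mask_outliers`), the method-of-moments negative binomial fit, and the `1e-6` probability threshold are all assumptions chosen for illustration. Flagged counts are only marked as missing (`None`); the random forest-based imputation and the edgeR (Robust) analysis that would follow are not shown.

```python
import math
from statistics import mean, variance


def log_count_prob(k, mu, var):
    """Log-probability of count k under a method-of-moments fit (assumption).

    Uses a negative binomial when var > mu, else falls back to a Poisson.
    """
    if var > mu:
        size = mu * mu / (var - mu)  # NB dispersion parameter ("size")
        return (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
                + size * math.log(size / (size + mu))
                + k * math.log(mu / (size + mu)))
    # No overdispersion in the remaining samples: Poisson log-pmf
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def loo_outlier_flags(counts, threshold=1e-6):
    """Flag each count that is improbable given a fit to the other samples.

    `threshold` is a hypothetical cut-off, not a value taken from the paper.
    Needs at least three samples so that a variance can be computed.
    """
    flags = []
    for i, k in enumerate(counts):
        rest = counts[:i] + counts[i + 1:]  # leave sample i out
        mu, var = mean(rest), variance(rest)
        if mu == 0:
            # All other samples are zero: any positive count is flagged
            flags.append(k > 0)
            continue
        flags.append(log_count_prob(k, mu, var) < math.log(threshold))
    return flags


def mask_outliers(counts, threshold=1e-6):
    """Replace flagged counts with None so an imputer can fill them later."""
    flags = loo_outlier_flags(counts, threshold)
    return [None if flagged else k for k, flagged in zip(counts, flags)]


gene_counts = [10, 12, 9, 11, 500, 10]
print(mask_outliers(gene_counts))
```

In this toy example only the count of 500 is masked: with that sample left out, the remaining counts are tightly clustered around 10, so 500 is extremely improbable. The ordinary counts survive because the fit that still contains the outlier is highly dispersed.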