School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China.
School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China.
Comput Biol Med. 2024 Mar;171:108177. doi: 10.1016/j.compbiomed.2024.108177. Epub 2024 Feb 23.
With the growing number of microRNAs (miRNAs), identifying essential miRNAs has become an urgent task, yet few computational methods exist for it. Here, we propose a novel framework called Rotation Forest for Essential MicroRNA identification (RFEM) to predict the essentiality of miRNAs in mice. We first constructed 1,264 features for every miRNA sample by fusing 38 miRNA features obtained from the PESM paper with 1,226 miRNA functional features calculated from miRNA-target gene interactions. We then used 182 training samples with these 1,264 features to train a rotation forest model, which was applied to compute essentiality scores for the candidate samples. The main innovations of RFEM are as follows: 1) miRNA functional features are introduced to enrich the diversity of miRNA features; 2) the rotation forest model uses decision trees as base classifiers and increases the diversity among them through feature transformation, which yields better ensemble results. Experimental results show that RFEM significantly outperformed two previous models, with an AUC (AUPR) of 0.942 (0.944), in three comparison experiments under 5-fold cross-validation, demonstrating the model's reliable performance. An ablation study further demonstrated the effectiveness of the novel miRNA functional features. In addition, in case studies assessing the essentiality of unlabeled miRNAs, experimental literature confirmed that 7 of the top 10 predicted miRNAs have crucial biological functions in mice. RFEM should therefore be a reliable tool for identifying essential miRNAs.
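The feature fusion and the feature transformation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_rotation_matrix`, the parameters `subset_size` and `sample_frac`, and the random stand-in data (with only 22 functional features instead of 1,226) are assumptions made for illustration. The sketch follows the general rotation forest recipe of splitting the features into disjoint subsets, running PCA on a bootstrap sample of each subset, and assembling the principal axes into one block-diagonal rotation matrix. The class-subsampling step of the original algorithm is omitted, as is the training of a decision tree on each rotated dataset.

```python
import numpy as np


def build_rotation_matrix(X, subset_size=3, sample_frac=0.75, rng=None):
    """Build one block-diagonal rotation matrix, rotation-forest style.

    Hypothetical helper for illustration; parameter names are assumptions.
    """
    rng = np.random.default_rng(rng)
    n_samples, n_features = X.shape

    # Randomly split the feature indices into disjoint subsets.
    perm = rng.permutation(n_features)
    subsets = [perm[i:i + subset_size] for i in range(0, n_features, subset_size)]

    R = np.zeros((n_features, n_features))
    for idx in subsets:
        # Draw a bootstrap sample of the rows for this feature subset.
        n_draw = max(len(idx), int(sample_frac * n_samples))
        rows = rng.choice(n_samples, size=n_draw, replace=True)
        block = X[np.ix_(rows, idx)]
        block = block - block.mean(axis=0)

        # PCA via SVD: the rows of vt are the principal axes of the block.
        _, _, vt = np.linalg.svd(block, full_matrices=True)

        # Place the axes in the rows/columns that belong to this subset.
        R[np.ix_(idx, idx)] = vt.T
    return R


# Stand-in data: 182 samples, 38 "PESM" features fused with 22 "functional" ones.
data_rng = np.random.default_rng(0)
base = data_rng.normal(size=(182, 38))
functional = data_rng.normal(size=(182, 22))
X = np.hstack([base, functional])  # fused feature matrix

R = build_rotation_matrix(X, subset_size=3, rng=1)
X_rot = X @ R  # each base tree would be trained on its own rotated copy
```

Because every PCA block keeps all of its components, `R` is orthogonal: the rotation changes the axes the trees split on without discarding information. In a full ensemble, each base classifier would get a different `R` (a different random seed), which is the source of the diversity among base classifiers that the abstract describes.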