Business and Management Sciences Department, Purdue University, West Lafayette, IN, USA.
Institute of Oceanography and Environment (INOS), Universiti Malaysia Terengganu, 21030, Kuala Nerus, Terengganu, Malaysia.
BMC Bioinformatics. 2024 Nov 19;25(1):360. doi: 10.1186/s12859-024-05978-1.
RNA 5-methyluridine (m5U) modifications play a crucial role in biological processes, making their accurate identification a key focus in computational biology. This paper introduces Deep-m5U, a predictor designed to improve the identification of m5U modifications. Deep-m5U uses a hybrid pseudo-K-tuple nucleotide composition (PseKNC) for sequence formulation, the SHapley Additive exPlanations (SHAP) algorithm for discriminant feature selection, and a deep neural network (DNN) as the classifier.
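As a rough illustration of the sequence-formulation step, the sketch below computes plain K-tuple nucleotide composition vectors and concatenates them for several values of K. It is not the authors' implementation: it leaves out the "pseudo" correlation components of PseKNC, and the function names and the choice of K = 1, 2, 3 are assumptions made for the example.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA alphabet; DNA-style "T" is mapped to "U" below


def kmer_composition(seq, k):
    """Return normalized frequencies of all 4**k k-mers, in lexicographic order."""
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def hybrid_composition(seq, ks=(1, 2, 3)):
    """Concatenate k-mer composition vectors for several k (assumed values)."""
    features = []
    for k in ks:
        features.extend(kmer_composition(seq, k))
    return features
```

With K = 1, 2, 3 each sequence becomes a vector of 4 + 16 + 64 = 84 values. A feature-selection step such as SHAP would then rank these dimensions before they are passed to the classifier.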
The model was evaluated on two benchmark datasets, Full Transcript and Mature mRNA. Under 10-fold cross-validation, Deep-m5U achieved overall accuracies of 91.47% and 95.86% on the Full Transcript and Mature mRNA datasets, respectively; on independent samples, it attained 92.94% and 95.17%.
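The 10-fold cross-validation protocol can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code behind the reported figures: the abstract does not say whether the folds were stratified or whether accuracy was pooled across folds or averaged per fold, so the stratification and the pooled accuracy here are assumptions, and `train_fn` is a hypothetical callback that fits a model and returns a prediction function.

```python
import random


def stratified_folds(labels, n_folds=10, seed=0):
    """Split sample indices into n_folds groups with balanced class counts."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped
        for pos, idx in enumerate(indices):
            folds[(pos + offset) % n_folds].append(idx)
        offset += len(indices)
    return folds


def cross_val_accuracy(samples, labels, train_fn, n_folds=10, seed=0):
    """Pooled accuracy: each sample is predicted once, by a model that never saw it."""
    correct = 0
    for held_out in stratified_folds(labels, n_folds, seed):
        held = set(held_out)
        train_x = [s for i, s in enumerate(samples) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        predict = train_fn(train_x, train_y)  # hypothetical: returns a callable
        correct += sum(predict(samples[i]) == labels[i] for i in held_out)
    return correct / len(labels)
```

The independent-sample accuracy is a separate measurement: the model is trained once on the full training set and scored on samples held out from the entire cross-validation procedure.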
Compared to existing models, Deep-m5U showed approximately 5.23% and 3.73% higher accuracy on the training data and 3.95% and 3.26% higher accuracy on independent samples for the Full Transcript and Mature mRNA datasets, respectively. The reliability and effectiveness of Deep-m5U make it a valuable tool for scientists and a potential asset in pharmaceutical design and research.