Wang Yue, Wang Xuan, Cui Xiaodong, Meng Jia, Rong Rong
Department of Mathematical Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, China.
Department of Computer Science, University of Liverpool, L69 7ZB Liverpool, UK.
Mol Ther Nucleic Acids. 2023 Jan 27;31:411-420. doi: 10.1016/j.omtn.2023.01.014. eCollection 2023 Mar 14.
Dihydrouridine (D) is a modified pyrimidine nucleotide universally found in viral, prokaryotic, and eukaryotic species. It serves as a metabolic modulator for various pathological conditions, and its elevated levels in tumors are associated with a series of cancers. Precise identification of D sites on RNA is vital for understanding its biological function. A number of computational approaches have been developed for predicting D sites on tRNAs; however, none have considered mRNAs. We present here DPred, the first computational tool for predicting D on mRNAs in yeast from the primary RNA sequences. Built on a local self-attention layer and a convolutional neural network (CNN) layer, the proposed deep learning model outperformed classic machine learning approaches (random forest, support vector machines, etc.) and achieved reasonable accuracy and reliability with areas under the curve of 0.9166 and 0.9027 in jackknife cross-validation and on an independent testing dataset, respectively. Importantly, we showed that distinct sequence signatures are associated with the D sites on mRNAs and tRNAs, implying potentially different formation mechanisms and putative divergent functionality of this modification on the two types of RNA. DPred is available as a user-friendly Web server.
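The abstract says DPred predicts D sites from the primary RNA sequence but does not describe how the sequence is presented to the model. As a minimal sketch of one common preprocessing choice, the following extracts a fixed-length window around every uridine (each candidate D site) and one-hot encodes it. The function names, the flank size of 20 nt, the `N` padding character, and the one-hot encoding itself are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Iterator, List, Tuple

NUCLEOTIDES = "ACGU"  # channel order is an arbitrary choice for this sketch


def one_hot(seq: str) -> List[List[int]]:
    """Encode an RNA string as a list of 4-element indicator vectors.

    Characters outside ACGU (such as the padding 'N') become all zeros.
    """
    return [[int(base == n) for n in NUCLEOTIDES] for base in seq.upper()]


def uridine_windows(
    seq: str, flank: int = 20, pad: str = "N"
) -> Iterator[Tuple[int, str]]:
    """Yield (position, window) for every U in seq.

    Each window is 2 * flank + 1 characters long with the U at its center.
    Transcript ends are padded so that every window has the same length.
    The flank size is hypothetical, not a value reported in the abstract.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    padded = pad * flank + seq + pad * flank
    for i, base in enumerate(seq):
        if base == "U":
            yield i, padded[i : i + 2 * flank + 1]


if __name__ == "__main__":
    # Toy transcript; a real input would be a yeast mRNA sequence.
    for pos, window in uridine_windows("AUGGCUAC", flank=2):
        print(pos, window, one_hot(window)[2])
```

Each encoded window is a (length x 4) matrix, which is the kind of input an attention or convolutional layer could consume; the model architecture itself is not reproduced here.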