Liang Sirui, Zhao Yanxi, Jin Junru, Qiao Jianbo, Wang Ding, Wang Yu, Wei Leyi
School of Software, Shandong University, Jinan, 250101, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, China.
Comput Biol Med. 2023 Sep;164:107238. doi: 10.1016/j.compbiomed.2023.107238. Epub 2023 Jul 8.
Recent research has highlighted the pivotal role of RNA post-transcriptional modifications in regulating RNA expression and function. Accurate identification of RNA modification sites is important for understanding RNA function. In this study, we propose a novel RNA modification prediction method, Rm-LR, which uses a long-range-based deep learning approach to predict multiple types of RNA modifications from RNA sequences alone. Rm-LR incorporates two large-scale pre-trained RNA language models to capture discriminative sequential information and learn important local features, which are subsequently integrated through a bilinear attention network. Rm-LR supports ten RNA modification types (m6A, m1A, m5C, m5U, m6Am, Ψ, Am, Cm, Gm, and Um) and significantly outperforms state-of-the-art methods in predictive capability on benchmark datasets. Experimental results show the effectiveness and superiority of Rm-LR in predicting various RNA modifications, demonstrating the strong adaptability and robustness of the proposed model. We demonstrate that pre-trained RNA language models can learn dense biological sequential representations from a large-scale long-range RNA corpus while also enhancing the interpretability of the models. This work contributes to the development of accurate and reliable computational models for RNA modification prediction and provides insights into the complex landscape of RNA modifications.