ELMo4m6A: A Contextual Language Embedding-Based Predictor for Detecting RNA N6-Methyladenosine Sites.

Author information

Fan Yongxian, Sun Guicong, Pan Xiaoyong

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):944-954. doi: 10.1109/TCBB.2022.3173323. Epub 2023 Apr 3.

Abstract

N6-methyladenosine (m6A) is a universal post-transcriptional modification of RNAs and is widely involved in various biological processes. Accurately identifying m6A modification sites is indispensable for further investigating m6A-mediated biological functions. How RNA sequences are represented is crucial for building effective computational methods for detecting m6A modification sites. However, traditional encoding methods require complex biological prior knowledge and are time-consuming. Furthermore, most existing m6A site prediction methods are limited to a single species, and few can predict m6A sites across different species and tissues. Thus, a more efficient computational method is needed to predict m6A sites across multiple species and tissues. In this paper, we propose ELMo4m6A, a contextual language embedding-based method for predicting m6A sites from RNA sequences without any prior knowledge. ELMo4m6A first learns embeddings of RNA sequences using the language model ELMo, then uses a hybrid of a convolutional neural network (CNN) and a long short-term memory (LSTM) network to identify m6A sites. The results of 5-fold cross-validation and independent testing demonstrate that ELMo4m6A is superior to state-of-the-art methods. Moreover, we apply integrated gradients to find potential sequence patterns contributing to m6A sites.
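
The integrated-gradients step mentioned in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: ELMo4m6A attributes predictions of an ELMo + CNN + LSTM model, whereas the example below swaps in a toy logistic model over one-hot encoded nucleotides so that it runs with the Python standard library alone. The weights, bias, example sequence, all-zero baseline, and every function name here are hypothetical choices made for illustration only.

```python
import math

ALPHABET = "ACGU"  # one channel per RNA nucleotide


def one_hot(seq):
    """Encode an RNA string as a flat list of floats, 4 channels per position."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in ALPHABET)
    return vec


def predict(x, weights, bias):
    """Toy stand-in model (assumption): logistic regression on the encoding."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x, weights, bias):
    """Analytic gradient of the toy model's output with respect to its input."""
    p = predict(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Midpoint Riemann approximation of integrated gradients.

    Gradients are averaged along the straight path from baseline to x,
    then scaled by (x - baseline) for each input feature.
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = gradient(point, weights, bias)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


def per_position(attr, seq):
    """Sum the 4 channel attributions at each position of the sequence."""
    return [(base, sum(attr[4 * i:4 * i + 4])) for i, base in enumerate(seq)]


# Hypothetical toy input and arbitrary weights, not values from the paper.
seq = "GGACU"
weights = [((i * 7) % 5 - 2) / 4 for i in range(4 * len(seq))]
bias = -0.3

x = one_hot(seq)
baseline = [0.0] * len(x)  # all-zero baseline, a common but not unique choice
attr = integrated_gradients(x, baseline, weights, bias)

for base, score in per_position(attr, seq):
    print(f"{base}: {score:+.4f}")
```

A useful sanity check for any such implementation is the completeness property of integrated gradients: the attributions should sum, up to approximation error, to predict(x) minus predict(baseline). Positions with large positive scores are the ones this toy model relies on most; in the setting of the paper, the analogous scores would be computed against the trained network rather than a linear model.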
