EMDLP: Ensemble multiscale deep learning model for RNA methylation site prediction.

Affiliations

Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China.

School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China.

Publication information

BMC Bioinformatics. 2022 Jun 8;23(1):221. doi: 10.1186/s12859-022-04756-1.

Abstract

BACKGROUND

Recent research suggests that epitranscriptome regulation through post-transcriptional RNA modifications is essential for all types of RNA. Accurate identification of RNA modifications is vital for understanding their functions and regulatory mechanisms. However, traditional experimental methods for identifying RNA modification sites are relatively complicated, time-consuming, and laborious. Machine learning approaches have been applied to RNA sequence feature extraction and classification, and can complement experimental approaches efficiently. Recently, convolutional neural networks (CNNs) and long short-term memory (LSTM) networks have achieved notable results in modification site prediction owing to their strength in representation learning. However, a CNN can learn local responses from spatial data but cannot learn sequential correlations, whereas an LSTM is specialized for sequential modeling and can access contextual representations but lacks the spatial feature extraction of a CNN. These complementary limitations motivate a prediction framework that combines natural language processing (NLP) and deep learning (DL).

RESULTS

This study presents an ensemble multiscale deep learning predictor (EMDLP) that identifies RNA methylation sites using NLP and DL. It combines dilated convolution with bidirectional LSTM (BiLSTM) to make better use of both local and global information for site prediction. The first step of EMDLP is to represent the RNA sequences in an NLP manner. Three encodings are adopted to characterize sites from the viewpoints of local and global information: RNA word embedding, one-hot encoding, and RGloVe, an improved word-vector representation learning method based on GloVe. Then, a dilated convolutional bidirectional LSTM network (DCB) model, consisting of a dilated convolutional neural network (DCNN) followed by a BiLSTM, is constructed to extract potentially contributing features for methylation site prediction. Finally, the predictions from the three encodings are integrated by soft voting to obtain better predictive performance. Experimental results on m1A and m6A show that EMDLP achieves an area under the receiver operating characteristic curve (AUROC) of 95.56% and 85.24%, respectively, outperforming state-of-the-art models. For user convenience, a webserver for EMDLP is publicly available at http://www.labiip.net/EMDLP/index.php ( http://47.104.130.81/EMDLP/index.php ).
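As a rough illustration of two of the steps named above, the following Python sketch shows one-hot encoding of an RNA sequence and soft voting over per-model probabilities. It is a minimal sketch under stated assumptions, not the authors' implementation: the nucleotide order, the handling of unknown symbols, the equal weighting of the three models, the 0.5 threshold, and the function names `one_hot_encode` and `soft_vote` are all hypothetical choices that the abstract does not specify.

```python
from typing import Dict, List, Sequence, Tuple

# Nucleotide order is an assumption; the abstract does not specify it.
ALPHABET = "ACGU"


def one_hot_encode(seq: str) -> List[List[int]]:
    """Map each nucleotide to a 4-dimensional indicator vector.

    Symbols outside ALPHABET (for example 'N') become all-zero rows.
    This convention is assumed here, not taken from the paper.
    """
    index: Dict[str, int] = {base: i for i, base in enumerate(ALPHABET)}
    encoded: List[List[int]] = []
    # Accept DNA-style input by treating T as U.
    for base in seq.upper().replace("T", "U"):
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


def soft_vote(probabilities: Sequence[float],
              threshold: float = 0.5) -> Tuple[float, bool]:
    """Average the positive-class probabilities and threshold the mean.

    Equal weights are an assumption; a weighted mean is also common.
    """
    if not probabilities:
        raise ValueError("need at least one model probability")
    mean_prob = sum(probabilities) / len(probabilities)
    return mean_prob, mean_prob >= threshold


if __name__ == "__main__":
    print(one_hot_encode("GGACU"))
    # Hypothetical outputs of three models, one per encoding.
    mean_prob, is_site = soft_vote([0.8, 0.7, 0.3])
    print(f"mean probability = {mean_prob:.2f}, predicted site = {is_site}")
```

Soft voting averages probabilities rather than counting hard labels, so a confident model can outweigh two weakly dissenting ones. In the example call, two of the three hypothetical models favor a methylation site and the averaged probability stays above the threshold.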

CONCLUSIONS

We developed a predictor for m1A and m6A methylation sites.
