EDLMFC: an ensemble deep learning framework with multi-scale features combination for ncRNA-protein interaction prediction.

Affiliation

Department of Biomedical Engineering, Faculty of Environment and Life, Beijing International Science and Technology Cooperation Base for Intelligent Physiological Measurement and Clinical Transformation, Beijing University of Technology, Beijing, 100124, China.

Publication information

BMC Bioinformatics. 2021 Mar 19;22(1):133. doi: 10.1186/s12859-021-04069-9.

Abstract

BACKGROUND

Interactions between non-coding RNAs (ncRNAs) and proteins play essential roles in various physiological and pathological processes. Experimental methods for identifying ncRNA-protein interactions are time-consuming and labor-intensive. There is therefore an increasing demand for computational methods that predict ncRNA-protein interactions accurately and efficiently.

RESULTS

In this work, we present EDLMFC, an ensemble deep learning method that predicts ncRNA-protein interactions from a combination of multi-scale features: primary sequence features, secondary structure sequence features, and tertiary structure features. Conjoint k-mer is used to extract sequence features from proteins and ncRNAs; these are integrated with tertiary structure features and fed into an ensemble deep learning model. The model combines a convolutional neural network (CNN), which learns dominant biological information, with a bi-directional long short-term memory network (BLSTM), which captures long-range dependencies among the features identified by the CNN. Under five-fold cross-validation, EDLMFC outperforms other state-of-the-art methods, with accuracies of 93.8%, 89.7%, and 86.1% on the RPI1807, NPInter v2.0, and RPI488 datasets, respectively. Independent tests demonstrate that EDLMFC can effectively predict potential ncRNA-protein interactions from different organisms. Furthermore, EDLMFC successfully predicts hub ncRNAs and proteins in ncRNA-protein networks of Mus musculus.
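To make the sequence-feature step more concrete, the following is a minimal sketch of k-mer frequency features for an ncRNA sequence. It is not taken from the EDLMFC source code. The function names `kmer_frequencies` and `conjoint_kmer_features`, the choice of `max_k=3`, and the reading of "conjoint k-mer" as a concatenation of the frequency vectors for k = 1 up to `max_k` are all assumptions made for illustration; the abstract does not specify these details.

```python
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGU"):
    """Return normalized k-mer frequencies of seq over the given alphabet.

    Hypothetical helper for illustration; windows containing characters
    outside the alphabet are skipped.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    valid_windows = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            valid_windows += 1
    if valid_windows == 0:
        # Sequence too short or no valid window: return an all-zero vector.
        return [0.0] * len(kmers)
    return [counts[m] / valid_windows for m in kmers]


def conjoint_kmer_features(seq, max_k=3, alphabet="ACGU"):
    """Concatenate k-mer frequency vectors for k = 1..max_k.

    Assumption: this concatenation is one plausible reading of
    "conjoint k-mer"; the abstract does not define it.
    """
    features = []
    for k in range(1, max_k + 1):
        features.extend(kmer_frequencies(seq, k, alphabet))
    return features
```

With a four-letter RNA alphabet and `max_k=3`, the resulting vector has 4 + 16 + 64 = 84 entries. A vector of this kind could then be combined with structure-derived features before being passed to a CNN and BLSTM model, as the abstract describes.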

CONCLUSIONS

Overall, EDLMFC improves the accuracy of ncRNA-protein interaction prediction and is expected to provide useful guidance for research on ncRNA functions. The source code of EDLMFC and the datasets used in this work are available at https://github.com/JingjingWang-87/EDLMFC .

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c77e/7980572/57de1e975cab/12859_2021_4069_Fig1_HTML.jpg
