LGFC-CNN: Prediction of lncRNA-Protein Interactions by Using Multiple Types of Features through Deep Learning.

Affiliations

Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China.

College of Software, Jilin University, Changchun 130012, China.

Publication information

Genes (Basel). 2021 Oct 24;12(11):1689. doi: 10.3390/genes12111689.

Abstract

Long noncoding RNAs (lncRNAs) play a crucial role in many critical biological processes and participate in complex human diseases through interactions with proteins. Because identifying lncRNA-protein interactions experimentally is expensive and time-consuming, we propose LGFC-CNN, a novel deep learning method that combines raw sequence composition features, hand-designed features and structure features to predict lncRNA-protein interactions. Two sequence preprocessing methods and two CNN modules (GloCNN and LocCNN) are used to extract global and local features from the raw sequences. Meanwhile, we select hand-designed features by comparing the predictive performance of different combinations of lncRNA and protein features. Furthermore, we obtain structure features and unify their dimensions through a Fourier transform. Finally, the four types of features are integrated to predict lncRNA-protein interactions comprehensively. Compared with other state-of-the-art methods on three lncRNA-protein interaction datasets, LGFC-CNN achieves the best performance, with accuracies of 94.14% on RPI21850, 92.94% on RPI7317 and 98.19% on RPI1847. These results show that LGFC-CNN can effectively predict lncRNA-protein interactions by combining raw sequence composition features, hand-designed features and structure features.
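The abstract states that structure features are brought to a common dimension through a Fourier transform but gives no details. As a rough illustration only, the sketch below shows one way a variable-length numeric profile could be mapped to a fixed-length vector by keeping the magnitudes of its first few discrete Fourier coefficients. The function name `fourier_fixed_length`, the choice of 10 terms, the use of magnitudes, and the 0/1 encoding of the toy profile are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def fourier_fixed_length(profile, n_terms=10):
    """Map a variable-length numeric profile to a vector of n_terms values.

    Hypothetical helper: keeps the magnitudes of the first n_terms
    coefficients of the real-input DFT, scaled by the profile length.
    """
    x = np.asarray(profile, dtype=float)
    if x.size == 0:
        raise ValueError("profile must not be empty")
    # For a real input of length n, rfft returns n // 2 + 1 coefficients.
    coeffs = np.fft.rfft(x) / x.size
    mags = np.abs(coeffs[:n_terms])
    # Zero-pad when the profile is too short to supply n_terms coefficients.
    out = np.zeros(n_terms)
    out[: mags.size] = mags
    return out


# Toy profiles of different lengths (assumed 0/1 and real-valued encodings).
short = fourier_fixed_length([1, 0, 1, 1, 0])
long = fourier_fixed_length(np.sin(np.linspace(0, 20, 500)))
print(short.shape, long.shape)  # (10,) (10,)
```

Both profiles end up with the same length, so they could be concatenated with other fixed-size feature vectors before being passed to a classifier. The first entry is the mean of the profile, and profiles shorter than about twice `n_terms` are zero-padded.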


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2415/8621699/c4db24537d36/genes-12-01689-g001.jpg
