Constructing discriminative feature space for LncRNA-protein interaction based on deep autoencoder and marginal fisher analysis.

Author information

Teng Zhixia, Zhang Yiran, Dai Qiguo, Wu Chengyan, Li Dan

Affiliations

College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, Heilongjiang, China.

College of Computer Science and Engineering, Dalian Minzu University, Dalian, 116600, Liaoning, China.

Publication information

Comput Biol Med. 2023 May;157:106711. doi: 10.1016/j.compbiomed.2023.106711. Epub 2023 Feb 28.

Abstract

Long non-coding RNAs (lncRNAs) play important roles in many biological processes and life activities by regulating proteins. To uncover the molecular mechanisms of lncRNAs, it is necessary to identify their interactions with proteins. Recently, several machine learning methods have been proposed to detect lncRNA-protein interactions according to the distribution of known interactions. The performance of these methods depends largely on (1) how accurately the feature space characterizes the distribution of known interactions and (2) how discriminative the feature space is for distinguishing lncRNA-protein interactions. Because the known interactions may be diverse and complex in form, constructing a discriminative feature space for lncRNA-protein interactions remains a challenge. To address this problem, this paper presents DFRPI, a novel method based on a deep autoencoder and marginal Fisher analysis. First, initial features of lncRNA-protein interactions are extracted from the primary sequences and secondary structures of lncRNAs and proteins. Second, a deep autoencoder learns encoding parameters for the initial features so that the known interactions are described precisely. Next, marginal Fisher analysis optimizes these encoding parameters to yield a discriminative feature space for lncRNA-protein interactions. Finally, a random forest-based predictor is trained on the discriminative feature space to detect lncRNA-protein interactions. In a series of experiments, the predictor achieved a precision of 0.920, recall of 0.916, accuracy of 0.918, MCC of 0.836, specificity of 0.920, sensitivity of 0.916, and AUC of 0.906, outperforming the compared methods for predicting lncRNA-protein interactions. These results suggest that the proposed method can generate a reasonable and effective feature space for accurately distinguishing lncRNA-protein interactions. The code and data are available at https://github.com/D0ub1e-D/DFRPI.
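For readers less familiar with the evaluation metrics quoted above, the following minimal sketch shows how precision, recall (sensitivity), specificity, accuracy, and MCC follow from the four confusion-matrix counts. The function name `binary_metrics` and the counts are hypothetical and are not taken from the paper or its repository. The counts assume a balanced test set of 1000 positive and 1000 negative pairs, chosen only because they round to the reported values. AUC is omitted because it requires ranked prediction scores, not a single confusion matrix.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # recall and sensitivity are the same quantity
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    # Matthews correlation coefficient
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts (illustration only, not data from the paper).
metrics = binary_metrics(tp=916, fp=80, tn=920, fn=84)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because recall and sensitivity are computed from the same counts, the two always coincide, which is why both appear as 0.916 in the abstract.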

