Protein-protein interaction prediction based on ordinal regression and recurrent convolutional neural networks.

Affiliations

School of Information Management, Shanghai Lixin University of Accounting and Finance, No. 995 Shangchuan Road, Shanghai, 201209, China.

Shanghai Key Laboratory of Intelligent Information Processing, and School of Computer Science, Fudan University, No. 220 Handan Road, Shanghai, 200433, China.

Publication information

BMC Bioinformatics. 2021 Oct 8;22(Suppl 6):485. doi: 10.1186/s12859-021-04369-0.

Abstract

BACKGROUND

Protein-protein interactions (PPIs) are essential to most biological processes. Predicting PPIs helps clarify protein function and thus supports pathological analysis, disease diagnosis, and drug design. As the amount of protein data grows rapidly in the post-genomic era, high-throughput experimental methods for identifying PPIs remain expensive and time-consuming. Computational methods have therefore attracted researchers' attention in recent years, and a large number of them have been proposed based on different protein sequence encoders.

RESULTS

Notably, the confidence score of a protein sequence pair can be regarded as a measure of interaction: the higher the confidence score for a protein pair, the more likely the pair is to interact. This paper therefore introduces a deep learning framework, the ordinal regression and recurrent convolutional neural network (OR-RCNN) method, to predict PPIs from the perspective of the confidence score. It consists of two main parts: an encoder for the protein sequence pair and a predictor of PPIs based on the confidence score. In the first part, two recurrent convolutional neural networks (RCNNs) with shared parameters construct two protein sequence embedding vectors, automatically extracting robust local features and sequential information from the protein pair. The two embedding vectors are then combined into a single new embedding vector by element-wise multiplication. In the second part, ordinal regression is used to construct multiple sub-classifiers, taking into account the ordinal information behind the confidence score. The outputs of the sub-classifiers are aggregated to obtain the final confidence score, which then determines whether a PPI exists: we set a threshold [Formula: see text] and say that the protein pair interacts if its confidence score is greater than [Formula: see text].
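
The prediction steps described above (element-wise combination of the two embeddings, several ordinal sub-classifiers, aggregation, thresholding) can be illustrated with a minimal sketch. This is not the authors' implementation. The RCNN encoder is omitted and replaced by hand-written toy embeddings. The function names, the shared weight vector with ordered biases, the averaging rule used for aggregation, and the threshold value 0.5 are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def combine_embeddings(u, v):
    """Element-wise (Hadamard) product of two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    return [a * b for a, b in zip(u, v)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sub_classifier_probs(z, weights, biases):
    """One logistic sub-classifier per ordinal level.

    Assumption: sub-classifier k estimates the probability that the
    confidence score of the pair exceeds level k. All sub-classifiers
    share one weight vector and differ only in their bias.
    """
    logit = sum(w * x for w, x in zip(weights, z))
    return [sigmoid(logit + b) for b in biases]


def aggregate(probs):
    """Assumed aggregation rule: mean of the sub-classifier outputs, in [0, 1]."""
    return sum(probs) / len(probs)


def interacts(score, threshold):
    """Declare an interaction when the confidence score exceeds the threshold."""
    return score > threshold


# Toy embeddings standing in for the two RCNN outputs (hypothetical values).
u = [0.5, -1.0, 2.0]
v = [2.0, 0.5, 0.25]
z = combine_embeddings(u, v)

# Hypothetical parameters: shared weights, one bias per ordinal level.
weights = [1.0, 1.0, 1.0]
biases = [1.0, 0.0, -1.0]

probs = sub_classifier_probs(z, weights, biases)
score = aggregate(probs)
decision = interacts(score, threshold=0.5)
print(z, [round(p, 3) for p in probs], round(score, 3), decision)
```

Because the biases decrease from one level to the next, the sub-classifier outputs are non-increasing: a pair is never judged more likely to exceed a higher level than a lower one. That ordering is the ordinal information a plain binary classifier would discard.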

CONCLUSIONS

We applied our method to predict PPIs on the S. cerevisiae and Homo sapiens data sets. In these experiments, our method outperforms state-of-the-art PPI prediction models.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e42d/8501564/a3a1b4b3ecec/12859_2021_4369_Fig1_HTML.jpg
