Amalgamation of 3D structure and sequence information for protein-protein interaction prediction.

Affiliation

Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, Bihar, 801103, India.

Publication information

Sci Rep. 2020 Nov 5;10(1):19171. doi: 10.1038/s41598-020-75467-x.

DOI:10.1038/s41598-020-75467-x

PMID:33154416

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7645622/

Abstract

Proteins are the primary building blocks of living organisms. They interact with other proteins and thereby take part in various biological processes. Knowledge of protein-protein interactions (PPIs) helps in predicting, and hence understanding, the function of proteins and the causes and progression of diseases, and in designing new drugs. However, there is a vast gap between the number of available protein sequences and the number of identified protein-protein interactions. To bridge this gap, researchers have proposed several computational methods to reveal the interactions between proteins. These methods depend solely on sequence-based information. With advances in technology, other types of protein information, such as 3D structure, have become available. Deep learning techniques have also been adopted successfully in various domains, including bioinformatics. The current work therefore focuses on using different modalities, namely the 3D structures and sequence-based information of proteins, together with deep learning algorithms to predict PPIs. The proposed approach is divided into several phases. We first obtain several representations of each protein from its 3D coordinate information and three attributes of its amino acids (the building blocks of proteins): hydropathy index, isoelectric point, and charge. A pre-trained ResNet50 model, a type of convolutional neural network, is used to extract features from these representations. Autocovariance and conjoint triad, two widely used sequence-based methods for encoding proteins, provide the second modality. A stacked autoencoder is used to obtain a compact form of the sequence-based information. Finally, the features obtained from the different modalities are concatenated in pairs and fed into a classifier to predict labels for protein pairs.
We experimented on a human PPI dataset and a Saccharomyces cerevisiae PPI dataset and compared our results with state-of-the-art deep-learning-based classifiers. The results achieved by the proposed method are superior to those obtained by the existing methods. Extensive experiments on different datasets indicate that our approach to learning and combining features from two different modalities is useful in PPI prediction.
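
The sequence-based modality named in the abstract (conjoint triad and autocovariance encoding, followed by pairwise concatenation) can be illustrated with a minimal sketch. This is not the authors' implementation. The seven-class residue grouping is the one commonly used for the conjoint triad descriptor, and the single property scale (the Kyte-Doolittle hydropathy index), the `max_lag` value, and all function names are assumptions made for illustration; the abstract does not specify them. The stacked autoencoder and ResNet50 stages are omitted.

```python
# Seven residue classes commonly used for the conjoint triad descriptor
# (an assumption here; the abstract does not list the grouping).
CT_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {aa: i for i, group in enumerate(CT_CLASSES) for aa in group}

# Kyte-Doolittle hydropathy index, used as one illustrative property scale.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def conjoint_triad(seq):
    """Return a 343-dimensional vector of normalised class-triad frequencies."""
    # Map residues to class indices, skipping non-standard symbols.
    classes = [RESIDUE_TO_CLASS[aa] for aa in seq if aa in RESIDUE_TO_CLASS]
    counts = [0] * (7 ** 3)
    # Count every window of three consecutive residue classes.
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        counts[a * 49 + b * 7 + c] += 1
    lo, hi = min(counts), max(counts)
    if hi == 0:  # sequence shorter than three residues
        return [0.0] * len(counts)
    # Normalise as (f - min) / max so that sequence length matters less.
    return [(f - lo) / hi for f in counts]


def autocovariance(seq, scale=HYDROPATHY, max_lag=5):
    """Return autocovariance of one property scale for lags 1..max_lag."""
    values = [scale[aa] for aa in seq if aa in scale]
    n = len(values)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    mean = sum(values) / n
    centred = [v - mean for v in values]
    # AC(lag) = mean over i of (x_i - mean) * (x_{i+lag} - mean).
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]


def encode_pair(seq_a, seq_b, max_lag=5):
    """Concatenate the sequence features of two proteins into one vector."""
    def encode(seq):
        return conjoint_triad(seq) + autocovariance(seq, max_lag=max_lag)
    return encode(seq_a) + encode(seq_b)
```

With the default `max_lag=5`, each protein yields 343 + 5 = 348 values, so `encode_pair("MKTAYIAKQR", "GAVLIPFYWS")` returns a vector of length 696. In a pipeline like the one the abstract describes, such sequence features would be compressed further (for example by an autoencoder) and combined with structure-derived features before classification.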

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3cab/7645622/358cc6d9ef5c/41598_2020_75467_Fig1_HTML.jpg

Similar articles

Predicting Protein-Protein Interactions via Random Ferns with Evolutionary Matrix Representation.

Comput Math Methods Med. 2022 Feb 22;2022:7191684. doi: 10.1155/2022/7191684. eCollection 2022.

Predicting protein-protein interactions from primary protein sequences using a novel multi-scale local feature representation scheme and the random forest.

PLoS One. 2015 May 6;10(5):e0125811. doi: 10.1371/journal.pone.0125811. eCollection 2015.

Ens-PPI: A Novel Ensemble Classifier for Predicting the Interactions of Proteins Using Autocovariance Transformation from PSSM.

Biomed Res Int. 2016;2016:4563524. doi: 10.1155/2016/4563524. Epub 2016 Jun 29.

SDNN-PPI: self-attention with deep neural network effect on protein-protein interaction prediction.

BMC Genomics. 2022 Jun 27;23(1):474. doi: 10.1186/s12864-022-08687-2.

Protein features fusion using attributed network embedding for predicting protein-protein interaction.

BMC Genomics. 2024 May 13;25(1):466. doi: 10.1186/s12864-024-10361-8.

Protein-Protein Interactions Prediction Using a Novel Local Conjoint Triad Descriptor of Amino Acid Sequences.

Int J Mol Sci. 2017 Nov 8;18(11):2373. doi: 10.3390/ijms18112373.

Improved protein-protein interactions prediction via weighted sparse representation model combining continuous wavelet descriptor and PseAA composition.

BMC Syst Biol. 2016 Dec 23;10(Suppl 4):120. doi: 10.1186/s12918-016-0360-6.

Predicting protein-protein interactions from protein sequences by a stacked sparse autoencoder deep neural network.

Mol Biosyst. 2017 Jun 27;13(7):1336-1344. doi: 10.1039/c7mb00188f.

Predicting protein-protein interactions using high-quality non-interacting pairs.

BMC Bioinformatics. 2018 Dec 31;19(Suppl 19):525. doi: 10.1186/s12859-018-2525-3.

Cited by

Topology-driven negative sampling enhances generalizability in protein-protein interaction prediction.

Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf148.

PIPENN-EMB ensemble net and protein embeddings generalise protein interface prediction beyond homology.

Sci Rep. 2025 Feb 5;15(1):4391. doi: 10.1038/s41598-025-88445-y.

Graph-based machine learning model for weight prediction in protein-protein networks.

BMC Bioinformatics. 2024 Nov 8;25(1):349. doi: 10.1186/s12859-024-05973-6.

An Ensemble Classifiers for Improved Prediction of Native-Non-Native Protein-Protein Interaction.

Int J Mol Sci. 2024 May 29;25(11):5957. doi: 10.3390/ijms25115957.

SpatialPPI: Three-dimensional space protein-protein interaction prediction with AlphaFold Multimer.

Comput Struct Biotechnol J. 2024 Mar 15;23:1214-1225. doi: 10.1016/j.csbj.2024.03.009. eCollection 2024 Dec.

Cracking the black box of deep sequence-based protein-protein interaction prediction.

Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae076.

In Silico Analysis: Genome-Wide Identification, Characterization and Evolutionary Adaptations of Bone Morphogenetic Protein (BMP) Gene Family in Homo sapiens.

Mol Biotechnol. 2024 Nov;66(11):3336-3356. doi: 10.1007/s12033-023-00944-3. Epub 2023 Nov 1.

Graph-BERT and language model-based framework for protein-protein interaction identification.

Sci Rep. 2023 Apr 6;13(1):5663. doi: 10.1038/s41598-023-31612-w.

ProtInteract: A deep learning framework for predicting protein-protein interactions.

Comput Struct Biotechnol J. 2023 Jan 25;21:1324-1348. doi: 10.1016/j.csbj.2023.01.028. eCollection 2023.

Protein-protein interaction prediction with deep learning: A comprehensive review.

Comput Struct Biotechnol J. 2022 Sep 19;20:5316-5341. doi: 10.1016/j.csbj.2022.08.070. eCollection 2022.
