Huang Yan, Wuchty Stefan, Zhou Yuan, Zhang Ziding
State Key Laboratory of Livestock and Poultry Biotechnology Breeding, College of Biological Sciences, China Agricultural University, Beijing 100193, China.
Department of Biomedical Informatics, Ministry of Education Key Laboratory of Molecular Cardiovascular Sciences, Center for Non-Coding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing 100191, China.
Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad020.
While deep learning (DL)-based models have emerged as powerful approaches for predicting protein-protein interactions (PPIs), their reliance on explicit similarity measures (e.g. sequence similarity and network neighborhood) to known interacting proteins makes these methods ineffective for novel proteins. The advent of AlphaFold2 presents both a significant opportunity and a challenge: predicting PPIs directly from monomer structures while controlling bias from protein sequences. In this work, we established Structure and Graph-based Predictions of Protein Interactions (SGPPI), a structure-based DL framework that predicts PPIs using a graph convolutional network. In particular, SGPPI focuses on protein patches on protein-protein binding interfaces and extracts structural, geometric and evolutionary features from the residue contact map to predict PPIs. We demonstrated that our model outperforms traditional machine learning methods and state-of-the-art DL-based methods on benchmark datasets free of representation bias. Moreover, our model trained on a human dataset can be reliably transferred to predict yeast PPIs, indicating that SGPPI can capture converging structural features of protein interactions across species. The implementation of SGPPI is available at https://github.com/emerson106/SGPPI.
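To make the two central ingredients named in the abstract concrete (a residue contact map and a graph convolution over it), the following is a minimal NumPy sketch. It is not the SGPPI implementation. The function names, the 8 Å C-alpha distance cutoff, the random node features and the mean-pooling step are all assumptions made for illustration, and the propagation rule is the widely used symmetric-normalized graph convolution, which may differ from what SGPPI uses.

```python
import numpy as np

def contact_map(coords, cutoff=8.0):
    """Binary residue contact map from an (N, 3) array of C-alpha coordinates.

    The 8.0 angstrom cutoff is an illustrative assumption, not a value from the paper.
    """
    diff = coords[:, None, :] - coords[None, :, :]   # pairwise difference vectors
    dist = np.linalg.norm(diff, axis=-1)             # (N, N) distance matrix
    return (dist < cutoff).astype(float)             # diagonal is 1 (self-loops)

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 A D^-1/2 X W).

    adj already contains self-loops, so every degree is at least 1.
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ feats @ weight, 0.0)

# Toy "protein": three residues close together and one far away.
rng = np.random.default_rng(0)
coords = np.array([[0.0, 0.0, 0.0],
                   [3.8, 0.0, 0.0],
                   [7.6, 0.0, 0.0],
                   [30.0, 0.0, 0.0]])
adj = contact_map(coords)

# Hypothetical per-residue features (stand-ins for structural,
# geometric and evolutionary descriptors) and a random weight matrix.
feats = rng.normal(size=(4, 5))
weight = rng.normal(size=(5, 2))

hidden = gcn_layer(adj, feats, weight)   # per-residue hidden representation
embedding = hidden.mean(axis=0)          # simple mean pooling to one vector
```

In a pairwise predictor of this general kind, one such pooled embedding per protein could be combined and passed to a classifier; how SGPPI actually selects interface patches and combines the two proteins is described in the paper and repository, not in this sketch.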