A Transformer-Based Ensemble Framework for the Prediction of Protein-Protein Interaction Sites.

Author information

Mou Minjie, Pan Ziqi, Zhou Zhimeng, Zheng Lingyan, Zhang Hanyu, Shi Shuiyang, Li Fengcheng, Sun Xiuna, Zhu Feng

Affiliations

College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, National Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China.

Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China.

Publication information

Research (Wash D C). 2023 Sep 27;6:0240. doi: 10.34133/research.0240. eCollection 2023.

Abstract

Identifying protein-protein interaction (PPI) sites is essential for studying protein function and discovering new drugs. A variety of machine learning-based computational tools have been developed to accelerate the identification of PPI sites. However, existing methods suffer from either low predictive accuracy or a limited scope of application. Specifically, some methods learn only global or local sequential features, which leads to low predictive accuracy, while others achieve improved performance by extracting residue interactions from structures but are limited in scope by their heavy dependence on precise structural information. There is an urgent need for a method that integrates comprehensive information to achieve accurate, proteome-wide profiling of PPI sites. Herein, EnsemPPIS, a novel ensemble framework for PPI site prediction, is proposed based on transformer and gated convolutional networks. EnsemPPIS can effectively capture not only global and local patterns but also residue interactions. Specifically, EnsemPPIS is unique in (a) extracting residue interactions from protein sequences with a transformer and (b) further integrating global and local sequential features through an ensemble learning strategy. Compared with various existing methods, EnsemPPIS exhibited either superior performance or broader applicability on multiple PPI site prediction tasks. Moreover, pattern analysis based on the interpretability of EnsemPPIS demonstrated that it is capable of learning residue interactions within the local structure of PPI sites using only sequence information. The web server of EnsemPPIS is freely available at http://idrblab.org/ensemppis.
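The abstract names three ingredients: transformer attention for residue interactions, gated convolution for local patterns, and an ensemble that combines the two views. The following is a minimal, illustrative sketch of those generic building blocks in plain Python. It is not the EnsemPPIS implementation: the function names, the odd-length kernels, and the simple averaging rule used for the ensemble are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention_weights(embeddings):
    """Scaled dot-product attention weights between every pair of residues.

    `embeddings` is a list of per-residue vectors (hypothetical inputs).
    Row i tells how strongly residue i attends to each residue j.
    """
    d = len(embeddings[0])
    weights = []
    for q in embeddings:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in embeddings]
        weights.append(softmax(scores))
    return weights

def gated_conv1d(values, kernel_a, kernel_b):
    """Gated linear unit over a sliding window: conv_a(x) * sigmoid(conv_b(x)).

    Assumes both kernels have the same odd length; the sequence is
    zero-padded so the output has one value per residue.
    """
    half = len(kernel_a) // 2
    padded = [0.0] * half + list(values) + [0.0] * half
    out = []
    for i in range(len(values)):
        window = padded[i:i + len(kernel_a)]
        a = sum(w * x for w, x in zip(kernel_a, window))
        b = sum(w * x for w, x in zip(kernel_b, window))
        out.append(a / (1.0 + math.exp(-b)))  # a * sigmoid(b)
    return out

def soft_vote(prob_global, prob_local, threshold=0.5):
    """Average two per-residue probability lists and threshold the result.

    Simple averaging is an assumed combination rule, used here only
    to illustrate the idea of an ensemble.
    """
    avg = [(p + q) / 2 for p, q in zip(prob_global, prob_local)]
    labels = [int(p >= threshold) for p in avg]
    return avg, labels
```

In this sketch, each row of `self_attention_weights` sums to 1 and can be read as a soft map of which residues influence a given residue, while `gated_conv1d` only sees a fixed window around each position. `soft_vote` then merges per-residue probabilities from a global model and a local model into one interface-site call per residue.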

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9936/10528219/56fdb782c8e0/research.0240.fig.001.jpg
