HN-PPISP: a hybrid network based on MLP-Mixer for protein-protein interaction site prediction.

Authors

Kang Yan, Xu Yulong, Wang Xinchao, Pu Bin, Yang Xuekun, Rao Yulong, Chen Jianguo

Affiliations

National Pilot School of Software, Yunnan University, Kunming, 650091, P.R. China.

College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, P.R. China.

Publication information

Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac480.

Abstract

MOTIVATION

Experimental approaches to identifying protein-protein interaction (PPI) sites are critical for understanding the mechanisms of biochemical processes, but they are time-consuming and laborious. With the development of deep learning (DL) techniques, computational methods, most popularly those based on convolutional neural networks (CNNs), have been proposed to address these problems. Although significant progress has been made, these methods are still limited in how they encode the characteristics of each amino acid in a protein sequence. Current methods process the position-specific scoring matrix (PSSM), secondary structure and raw protein sequence all together, and therefore cannot efficiently exploit the distinct nature of each. How to effectively model the PPI context with attention for site prediction also remains an open problem. In addition, long-distance dependencies among PPI features are important, yet they are very challenging for many CNN-based methods, because CNNs are inherently less able to capture such dependencies than models such as Transformers.

RESULTS

To effectively mine the properties of PPI features, a novel hybrid neural network named HN-PPISP is proposed, which integrates a Multi-layer Perceptron Mixer (MLP-Mixer) module for local feature extraction and a two-stage multi-branch module for global feature capture. The model combines the merits of Transformer, TextCNN and Bi-LSTM, making it a powerful alternative for PPI site prediction. On the one hand, this is the first application of an advanced Transformer-style architecture (i.e. MLP-Mixer) within a hybrid network for sequence-based PPI site prediction. On the other hand, unlike existing methods that treat all global features together, the proposed two-stage multi-branch hybrid module first assigns different attention scores to the input features and then encodes the features through different branch modules. In the first stage, different improved attention modules are combined to extract features from the raw protein sequences, secondary structure and PSSM, respectively. In the second stage, a multi-branch network aggregates information from two branches in parallel. The two branches encode the features and extract dependencies through operations such as TextCNN, Bi-LSTM and different activation functions. Experimental results on real-world public datasets show that the model consistently outperforms seven strong baselines, achieving state-of-the-art performance.
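
As a rough illustration of the MLP-Mixer idea mentioned above, the sketch below implements one generic Mixer block: a token-mixing MLP applied across positions, followed by a channel-mixing MLP applied across features, each with layer normalisation and a skip connection. It is not taken from the HN-PPISP source code. The assumption that the input is a window of residues by feature channels, the sizes used, the omission of biases and learned normalisation parameters, and all function names are hypothetical choices made for illustration. Only the Python standard library is used, to keep the example dependency-free.

```python
import math
import random


def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix (both lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def add(a, b):
    """Element-wise sum of two equally shaped matrices (used for skip connections)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def layer_norm(m, eps=1e-5):
    """Normalise each row to zero mean and unit variance (no learned scale or shift)."""
    out = []
    for row in m:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def gelu(v):
    """Exact GELU activation."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def mlp(m, w1, w2):
    """Two-layer perceptron applied to every row of m (biases omitted for brevity)."""
    hidden = [[gelu(v) for v in row] for row in matmul(m, w1)]
    return matmul(hidden, w2)


def mixer_block(x, tok_w1, tok_w2, ch_w1, ch_w2):
    """One Mixer block on x with shape (tokens, channels); output has the same shape."""
    # Token mixing: transpose so the MLP mixes information across positions.
    mixed = mlp(transpose(layer_norm(x)), tok_w1, tok_w2)
    x = add(x, transpose(mixed))
    # Channel mixing: the MLP mixes information across the features of each position.
    return add(x, mlp(layer_norm(x), ch_w1, ch_w2))


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: a window of 7 residues, 5 features per residue, hidden width 8.
rng = random.Random(0)
window, channels, hidden = 7, 5, 8
x = random_matrix(window, channels, rng, scale=1.0)
out = mixer_block(
    x,
    random_matrix(window, hidden, rng), random_matrix(hidden, window, rng),
    random_matrix(channels, hidden, rng), random_matrix(hidden, channels, rng),
)
print(len(out), len(out[0]))  # the block preserves the (tokens, channels) shape
```

Because both MLPs sit inside skip connections, the block reduces to the identity when all weights are zero, which gives a simple sanity check on the wiring.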

AVAILABILITY

The source code of the HN-PPISP model is available at https://github.com/ylxu05/HN-PPISP.
