MIPPIS: protein-protein interaction site prediction network with multi-information fusion.

Affiliations

College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China.

Department of Artificial Intelligence, Polytechnical University of Madrid, Madrid, 28031, Spain.

Publication information

BMC Bioinformatics. 2024 Nov 4;25(1):345. doi: 10.1186/s12859-024-05964-7.

Abstract

BACKGROUND

Predicting protein-protein interaction sites is crucial to understanding biochemical processes. Investigating the interaction between viruses and receptor proteins through biological techniques aids in understanding disease mechanisms and guides the development of corresponding drugs. While various methods have been proposed in the past, they often suffer from drawbacks such as long processing times, high costs, and low accuracy.

RESULTS

To address these challenges, we propose a novel protein-protein interaction site prediction network based on multi-information fusion. In our approach, the initial amino acid features are represented by the position-specific scoring matrix, hidden Markov model, dictionary of protein secondary structure, and one-hot encoding. We then adopt a multi-channel approach to extract deep-level amino acid features from different perspectives. The graph convolutional network channel extracts spatial structural information. The bidirectional long short-term memory channel treats the amino acid sequence as natural language, capturing the protein's primary structure information. The ProtT5 protein large language model channel outputs a more comprehensive amino acid embedding representation, complementing the two aforementioned channels. Finally, the obtained amino acid features are fed into the prediction layer for the final prediction.
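The idea of building per-residue features and combining several channels can be illustrated with a toy sketch. This is not the MIPPIS implementation: it uses only one-hot encoding as the initial feature, replaces the learned graph convolutional layer with a single parameter-free propagation step over a contact map, and assumes that fusion means simple concatenation, which the abstract does not specify. The function names `one_hot`, `gcn_propagate`, and `fuse` are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Return one 20-dimensional indicator vector per residue."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        if aa in index:  # unknown residues stay all-zero
            row[index[aa]] = 1.0
        rows.append(row)
    return rows


def gcn_propagate(adjacency, features):
    """One parameter-free graph convolution step: D^-1/2 (A + I) D^-1/2 X.

    A real GCN layer would also multiply by a learned weight matrix and
    apply a nonlinearity; both are omitted in this sketch.
    """
    n = len(adjacency)
    # Add self-loops so each residue keeps part of its own signal.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    out = []
    for i in range(n):
        row = [0.0] * len(features[0])
        for j in range(n):
            if a_hat[i][j]:
                w = a_hat[i][j] / math.sqrt(deg[i] * deg[j])
                for k, x in enumerate(features[j]):
                    row[k] += w * x
        out.append(row)
    return out


def fuse(*channels):
    """Concatenate per-residue vectors from several channels (assumed fusion)."""
    length = len(channels[0])
    return [sum((list(ch[i]) for ch in channels), []) for i in range(length)]


# Toy example: a three-residue chain where neighbours are in contact.
sequence = "ACD"
contact_map = [[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]]
x = one_hot(sequence)
h = gcn_propagate(contact_map, x)
fused = fuse(x, h)  # 20 sequence dims + 20 structure-aware dims per residue
```

In a full model, the fused vector for each residue would be passed to a classifier that outputs the probability that the residue is an interaction site; sequence-model and language-model embeddings would be further channels passed to `fuse`.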

CONCLUSION

Compared with six protein structure-based methods and six protein sequence-based methods, our model achieves the best performance across evaluation metrics, including accuracy, precision, F1 score, Matthews correlation coefficient, and area under the precision-recall curve, which demonstrates the superiority of our model.
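For readers unfamiliar with these metrics, the following is a minimal sketch of how they are commonly computed for a binary per-residue classifier. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, labels are assumed to be 0/1, and the area under the precision-recall curve is estimated with average precision, which is one common estimator and ignores tied scores.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1, and MCC from 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


def average_precision(y_true, scores):
    """Average precision: mean of precision at the rank of each positive."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    positives_seen = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            positives_seen += 1
            total += positives_seen / rank
    return total / positives_seen if positives_seen else 0.0


# Toy labels: 1 marks an interaction-site residue, 0 a non-site residue.
labels = [1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(labels, predictions)
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Interaction sites are typically a minority of residues, which is why threshold-free and imbalance-aware measures such as the Matthews correlation coefficient and the precision-recall curve are reported alongside accuracy.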

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc3e/11536593/be7bcdf8dfa3/12859_2024_5964_Fig1_HTML.jpg
