Deep-ProBind: binding protein prediction with transformer-based deep learning model.

Author information

Khan Salman, Noor Sumaiya, Awan Hamid Hussain, Iqbal Shehryar, AlQahtani Salman A, Dilshad Naqqash, Ahmad Nijad

Affiliations

Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, KPK, Pakistan.

Business and Management Sciences Department, Purdue University, West Lafayette, IN, USA.

Publication information

BMC Bioinformatics. 2025 Mar 22;26(1):88. doi: 10.1186/s12859-025-06101-8.

Abstract

Binding proteins play a crucial role in biological systems by selectively interacting with specific molecules, such as DNA, RNA, or peptides, to regulate various cellular processes. Their ability to recognize and bind target molecules with high specificity makes them essential for signal transduction, transport, and enzymatic activity. Traditional experimental methods for identifying protein-binding peptides are costly and time-consuming. Current sequence-based approaches often struggle with accuracy, focusing too narrowly on proximal sequence features and ignoring structural data. This study presents Deep-ProBind, a prediction model designed to classify protein binding sites by integrating sequence and structural information. The proposed model encodes peptides with a transformer-based and an evolutionary-based representation, namely Bidirectional Encoder Representations from Transformers (BERT) and the Pseudo Position-Specific Scoring Matrix with Discrete Wavelet Transform (PsePSSM-DWT). The SHapley Additive exPlanations (SHAP) algorithm selects the optimal hybrid features, and a Deep Neural Network (DNN) then serves as the classifier that predicts protein-binding peptides. The model was evaluated against traditional Machine Learning (ML) algorithms and existing models. Deep-ProBind achieved 92.67% accuracy with tenfold cross-validation on benchmark datasets and 93.62% accuracy on independent samples, outperforming existing models by 3.57% on training data and 1.52% on independent tests. These results demonstrate Deep-ProBind's reliability and effectiveness, making it a valuable tool for researchers and a potential resource in pharmacological studies, where peptide binding plays a critical role in therapeutic development.
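The abstract names PsePSSM and a discrete wavelet transform as the evolutionary encoding but does not spell out how they are computed or combined. The sketch below illustrates only the two generic building blocks, assuming the commonly used PsePSSM definition (per-column means plus lagged mean squared differences) and a single-level Haar wavelet. The function names `pse_pssm` and `haar_dwt`, the `max_lag` parameter, and the choice of the Haar wavelet are illustrative assumptions, not details taken from the paper, and this is not the authors' implementation.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def pse_pssm(pssm: Matrix, max_lag: int = 2) -> List[float]:
    """Turn an L x C PSSM into a fixed-length vector of C * (1 + max_lag) values.

    Assumed (commonly used) definition: the mean of each column, followed by,
    for each lag, the mean squared difference between rows `lag` apart.
    A real PSSM has C = 20 columns; the code works for any C.
    """
    length = len(pssm)
    if length <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    n_cols = len(pssm[0])

    # Composition part: average score of each column over the sequence.
    features = [sum(row[j] for row in pssm) / length for j in range(n_cols)]

    # Sequence-order part: correlation between positions `lag` residues apart.
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2
                for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features


def haar_dwt(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One level of the Haar wavelet transform (assumed wavelet choice).

    Returns (approximation, detail) coefficients. An odd-length signal is
    padded by repeating its last sample.
    """
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])
    scale = 2 ** 0.5
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail
```

In practice the PSSM would come from a profile search tool such as PSI-BLAST, and the resulting vector would be concatenated with the BERT embedding before feature selection. Those steps are omitted here because the abstract gives no parameters for them.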


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d13a/11929993/625d8d3f7ef6/12859_2025_6101_Fig1_HTML.jpg
