Integrated convolution and self-attention for improving peptide toxicity prediction.

Affiliations

Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan.

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China.

Publication information

Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae297.

DOI: 10.1093/bioinformatics/btae297

PMID: 38696758

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11654579/

Abstract

MOTIVATION

Peptides are promising agents for the treatment of a variety of diseases due to their specificity and efficacy. However, the development of peptide-based drugs is often hindered by the potential toxicity of peptides, which poses a significant barrier to their clinical application. Traditional experimental methods for evaluating peptide toxicity are time-consuming and costly, making the development process inefficient. Therefore, there is an urgent need for computational tools specifically designed to predict peptide toxicity accurately and rapidly, facilitating the identification of safe peptide candidates for drug development.

RESULTS

We present CAPTP, a novel computational approach that combines convolution and self-attention to improve the prediction of peptide toxicity from amino acid sequences. CAPTP achieves a Matthews correlation coefficient of approximately 0.82 both in cross-validation and on independent test datasets, surpassing existing state-of-the-art peptide toxicity predictors. Importantly, CAPTP remains robust and generalizable when the data are imbalanced. Further analysis with CAPTP reveals that certain sequential patterns, particularly in the head and central regions of peptides, are crucial in determining their toxicity. This insight can inform and guide the design of safer peptide drugs.
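For readers unfamiliar with the Matthews correlation coefficient (MCC) reported above, the following is a minimal sketch of how it is computed from binary confusion-matrix counts, and why it is more informative than accuracy on imbalanced data. It is not taken from the CAPTP source code; the function names and the confusion-matrix counts are hypothetical and chosen only for illustration.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention
    for the case where a whole row or column of the matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical imbalanced test set: 100 toxic and 900 non-toxic peptides.
# "trivial" labels every peptide non-toxic; "useful" recovers 85 toxic
# peptides at the cost of 20 false alarms. These numbers are made up.
classifiers = {
    "trivial": dict(tp=0, tn=900, fp=0, fn=100),
    "useful": dict(tp=85, tn=880, fp=20, fn=15),
}

for name, counts in classifiers.items():
    acc = accuracy(**counts)
    mcc = matthews_corrcoef(**counts)
    print(f"{name}: accuracy={acc:.3f}, MCC={mcc:.3f}")
```

In this toy setting the trivial classifier still reaches 90% accuracy but an MCC of 0, whereas the useful classifier scores an MCC of roughly 0.81. This is why MCC is a common headline metric when toxic and non-toxic classes are unevenly represented.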

AVAILABILITY AND IMPLEMENTATION

The source code for CAPTP is freely available at https://github.com/jiaoshihu/CAPTP.

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/50a3/11654579/cdca1fababa8/btae297f1.jpg
