
CSM-Toxin: A Web-Server for Predicting Protein Toxicity.

Authors

Morozov Vladimir, Rodrigues Carlos H M, Ascher David B

Affiliations

School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD 4072, Australia.

Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia.

Publication

Pharmaceutics. 2023 Jan 28;15(2):431. doi: 10.3390/pharmaceutics15020431.

PMID: 36839752
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9966851/
Abstract

Biologics are one of the most rapidly expanding classes of therapeutics, but they can be associated with a range of toxic properties. In small-molecule drug development, early identification of potential toxicity has led to a significant reduction in clinical trial failures; however, robust qualitative rules and predictive tools are currently lacking for peptide- and protein-based biologics. To address this, we manually curated the largest set of high-quality experimental data on peptide and protein toxicities and developed CSM-Toxin, a novel in silico protein toxicity classifier that relies solely on the protein primary sequence. Our approach encodes protein sequence information using a deep learning natural language model to understand "biological" language, in which residues are treated as words and protein sequences as sentences. CSM-Toxin accurately identified peptides and proteins with potential toxicity, achieving a Matthews correlation coefficient (MCC) of up to 0.66 across both cross-validation and multiple non-redundant blind tests, outperforming other methods and highlighting the robust and generalisable performance of our model. We strongly believe CSM-Toxin will serve as a valuable platform to minimise potential toxicity in the biologic development pipeline. Our method is freely available as an easy-to-use webserver.
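
Two ideas in the abstract can be made concrete with a short Python sketch: treating each residue as a "word" of a sequence "sentence", and scoring a binary classifier with the MCC. This is an illustration only, not the authors' implementation. The 20-letter vocabulary, the reserved index 0 for unknown residues, and the function names `tokenize` and `mcc` are assumptions made for this example; the published method feeds sequences to a deep learning language model rather than using raw integer ids directly.

```python
import math

# The 20 standard amino acids, one letter each (assumed vocabulary for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Index 0 is reserved for any residue outside the vocabulary (hypothetical choice).
VOCAB = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(sequence):
    """Map each residue ("word") of a sequence ("sentence") to an integer id."""
    return [VOCAB.get(residue, 0) for residue in sequence.upper()]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention used here: return 0.0 when a marginal is empty and MCC is undefined.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    print(tokenize("ACDX"))
    print(mcc(tp=40, tn=40, fp=10, fn=10))
```

The MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (perfect), and it stays informative when toxic and non-toxic classes are imbalanced, which is why it suits this kind of classifier.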

