Prediction of protease substrates using sequence and structure features.

Affiliation

Graduate Group in Bioinformatics, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA.

Publication information

Bioinformatics. 2010 Jul 15;26(14):1714-22. doi: 10.1093/bioinformatics/btq267. Epub 2010 May 26.

DOI:10.1093/bioinformatics/btq267

PMID:20505003

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC2894511/

Abstract

MOTIVATION

Granzyme B (GrB) and caspases cleave specific protein substrates to induce apoptosis in virally infected and neoplastic cells. While substrates for both types of proteases have been determined experimentally, there are many more yet to be discovered in humans and other metazoans. Here, we present a bioinformatics method based on support vector machine (SVM) learning that identifies sequence and structural features important for protease recognition of substrate peptides and then uses these features to predict novel substrates. Our approach can act as a convenient hypothesis generator, guiding future experiments by high-confidence identification of peptide-protein partners.
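
As a rough illustration of how sequence and structural features for a substrate peptide might be combined into one input vector for an SVM, the sketch below concatenates a one-hot encoding of each residue with a one-hot secondary-structure state and a relative solvent accessibility value. The abstract does not specify the encoding, so the window length, the three-state labels, the function name `encode_peptide`, and the example values are all illustrative assumptions, not the authors' feature set.

```python
# Hypothetical feature encoding for one candidate peptide (not the paper's exact scheme).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
SS_STATES = "HEC"  # assumed three-state labels: helix, strand, coil


def encode_peptide(peptide, secondary_structure, accessibility):
    """Return a flat feature list: per residue, 20 sequence bits,
    3 secondary-structure bits, and 1 accessibility value."""
    if not (len(peptide) == len(secondary_structure) == len(accessibility)):
        raise ValueError("all per-residue inputs must have the same length")
    features = []
    for residue, ss, acc in zip(peptide, secondary_structure, accessibility):
        # One-hot residue identity; an unknown residue yields all zeros.
        features.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
        # One-hot secondary-structure state.
        features.extend(1.0 if ss == state else 0.0 for state in SS_STATES)
        # Relative solvent accessibility, assumed to lie in [0, 1].
        features.append(float(acc))
    return features


# Made-up 8-residue peptide with made-up structural annotations.
vector = encode_peptide(
    "IEPDSGAK",
    "CCCCCCHH",
    [0.8, 0.7, 0.9, 0.6, 0.5, 0.4, 0.3, 0.2],
)
print(len(vector))
```

Vectors of this kind, built for known substrates and for non-cleaved peptides, could then be passed to any SVM implementation for training.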

RESULTS

The method is benchmarked on the known substrates of both protease types, including our literature-curated GrB substrate set (GrBah). On these benchmark sets, the method outperforms a number of other methods that consider sequence only, predicting at a 0.87 true positive rate (TPR) and a 0.13 false positive rate (FPR) for caspase substrates, and a 0.79 TPR and a 0.21 FPR for GrB substrates. The method is then applied to approximately 25 000 proteins in the human proteome to generate a ranked list of predicted substrates of each protease type. Two of these predictions, AIF-1 and SMN1, were selected for further experimental analysis, and each was validated as a GrB substrate.
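
For readers less familiar with the two metrics, the sketch below computes TPR and FPR from binary labels and predictions, using the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN). The toy labels are invented for illustration and are not benchmark data. Each reported pair sums to 1 (0.87 + 0.13 and 0.79 + 0.21), which is what an operating point with equal sensitivity and specificity would give, although the abstract does not say how the threshold was chosen.

```python
def tpr_fpr(labels, predictions):
    """Compute (true positive rate, false positive rate) for binary 0/1 data."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), fp / (fp + tn)


# Toy example: 4 true substrates and 4 non-substrates (invented values).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
tpr, fpr = tpr_fpr(labels, predictions)
print(tpr, fpr)
```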

AVAILABILITY

All predictions for both protease types are publicly available at http://salilab.org/peptide. A web server at the same site allows a user to train new SVM models to make predictions for any protein that recognizes specific oligopeptide ligands.
