C10Pred: A First Machine Learning Based Tool to Predict C10 Family Cysteine Peptidases Using Sequence-Derived Features.

Affiliations

Institute of Intelligence Informatics Technology, Sangmyung University, Seoul 03016, Korea.

Department of Pediatrics, Washington University in St. Louis, St. Louis, MO 63110, USA.

Publication information

Int J Mol Sci. 2022 Aug 23;23(17):9518. doi: 10.3390/ijms23179518.

DOI: 10.3390/ijms23179518
PMID: 36076915
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9455582/
Abstract

Streptococcus pyogenes, or group A Streptococcus (GAS), a gram-positive bacterium, is implicated in a wide range of clinical manifestations and life-threatening diseases. One of the key virulence factors of GAS is streptopain, a C10 family cysteine peptidase. Since its discovery, various homologs of streptopain have been reported from other bacterial species. With the increased affordability of sequencing, a significant increase in the number of potential C10 family-like sequences in the public databases is anticipated, posing a challenge in classifying such sequences. Sequence-similarity-based tools are the methods of choice to identify such streptopain-like sequences. However, these methods depend on some level of sequence similarity between the existing C10 family and the target sequences. Therefore, in this work, we propose a novel predictor, C10Pred, for the prediction of C10 peptidases using sequence-derived optimal features. C10Pred is a support vector machine (SVM) based model which is efficient in predicting C10 enzymes with an overall accuracy of 92.7% and Matthews' correlation coefficient (MCC) value of 0.855 when tested on an independent dataset. We anticipate that C10Pred will serve as a handy tool to classify novel streptopain-like proteins belonging to the C10 family and offer essential information.
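The abstract reports two evaluation metrics (overall accuracy and MCC) for a classifier trained on sequence-derived features. The sketch below is an illustration only: it shows one common sequence-derived feature, amino acid composition, and how accuracy and MCC are computed from confusion-matrix counts. The abstract does not say which features C10Pred actually uses, so the choice of amino acid composition and all function names here are assumptions, not the authors' method.

```python
import math
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors align.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard residue in a protein sequence.

    Assumption: this is one example of a sequence-derived feature; it is
    not claimed to be the feature set used by C10Pred.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews' correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

MCC ranges from -1 to 1 and uses all four confusion-matrix cells, which is why it is often reported alongside accuracy for binary classifiers. A feature vector such as `amino_acid_composition("MKTLLC")` could be passed to any SVM implementation; that step is left out here because the model configuration is not described in the abstract.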

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1233/9455582/8455c26dd8b7/ijms-23-09518-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1233/9455582/b9f02a09a07d/ijms-23-09518-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1233/9455582/5cea9c9ec1e8/ijms-23-09518-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1233/9455582/baba4e7aae1e/ijms-23-09518-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1233/9455582/3cb20d5cff70/ijms-23-09518-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1233/9455582/b5aacf60210a/ijms-23-09518-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1233/9455582/38aa074fa15d/ijms-23-09518-g007.jpg

Similar articles

1
C10Pred: A First Machine Learning Based Tool to Predict C10 Family Cysteine Peptidases Using Sequence-Derived Features.
Int J Mol Sci. 2022 Aug 23;23(17):9518. doi: 10.3390/ijms23179518.
2
APLpred: A machine learning-based tool for accurate prediction and characterization of asparagine peptide lyases using sequence-derived optimal features.
Methods. 2024 Sep;229:133-146. doi: 10.1016/j.ymeth.2024.05.014. Epub 2024 Jun 28.
3
Highly efficient recombinant production and purification of streptococcal cysteine protease streptopain with increased enzymatic activity.
Protein Expr Purif. 2016 May;121:66-72. doi: 10.1016/j.pep.2016.01.002. Epub 2016 Jan 7.
4
Evolutionary lines of cysteine peptidases.
Biol Chem. 2001 May;382(5):727-33. doi: 10.1515/BC.2001.088.
5
Prediction of redox-sensitive cysteines using sequential distance and other sequence-based features.
BMC Bioinformatics. 2016 Aug 24;17(1):316. doi: 10.1186/s12859-016-1185-4.
6
Identification and characterization of plastid-type proteins from sequence-attributed features using machine learning.
BMC Bioinformatics. 2013;14 Suppl 14(Suppl 14):S7. doi: 10.1186/1471-2105-14-S14-S7. Epub 2013 Oct 9.
7
Prediction of RNA-binding amino acids from protein and RNA sequences.
BMC Bioinformatics. 2011;12 Suppl 13(Suppl 13):S7. doi: 10.1186/1471-2105-12-S13-S7. Epub 2011 Nov 30.
8
lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.
PLoS One. 2015 Oct 5;10(10):e0139654. doi: 10.1371/journal.pone.0139654. eCollection 2015.
9
Capreomycin resistance prediction in two species of Mycobacterium using a stacked ensemble method.
J Appl Microbiol. 2019 Dec;127(6):1656-1664. doi: 10.1111/jam.14413. Epub 2019 Sep 8.
10
Prediction of the bonding states of cysteines using the support vector machines based on multiple feature vectors and cysteine state sequences.
Proteins. 2004 Jun 1;55(4):1036-42. doi: 10.1002/prot.20079.

Cited by

1
Prediction and validation of nanowire proteins in G20 using machine learning and feature engineering.
Comput Struct Biotechnol J. 2025 Apr 19;27:1706-1718. doi: 10.1016/j.csbj.2025.04.022. eCollection 2025.
