Support vector machine (SVM) based multiclass prediction with basic statistical analysis of plasminogen activators.

Authors

Muthukrishnan Selvaraj, Puri Munish, Lefevre Christophe

Affiliations

Fermentation and Protein Biotechnology Laboratory, Department of Biotechnology, Punjabi University, Patiala, India; CSIR-IMTECH, Chandigarh, India.

Publication information

BMC Res Notes. 2014 Jan 27;7:63. doi: 10.1186/1756-0500-7-63.

DOI:10.1186/1756-0500-7-63
PMID:24468032
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3924408/
Abstract

BACKGROUND

Plasminogen (Pg), the precursor of the proteolytic and fibrinolytic enzyme of blood, is converted to the active enzyme plasmin (Pm) by different plasminogen activators (tissue plasminogen activator and urokinase), including the bacterial activators streptokinase and staphylokinase. Because they activate Pg to Pm, these activators are used clinically for thrombolysis. Identifying Pg-activators is therefore an important step in understanding their functional mechanism and deriving new therapies.

METHODS

This study investigated computational methods for predicting plasminogen activator peptide sequences with high accuracy. Support vector machines (SVM) based on amino acid composition (AC), dipeptide composition (DC), PSSM profile, and Hybrid methods were used to predict different Pg-activators of both prokaryotic and eukaryotic origin.
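The composition features named above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, non-standard residues are simply skipped, and the `hybrid_features` helper assumes that "Hybrid" means plain concatenation of feature vectors, which the abstract does not specify. PSSM profiles are omitted because they require an external alignment tool.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def _standard_residues(seq: str) -> list[str]:
    # Assumption: characters outside the 20 standard residues are dropped.
    return [r for r in seq.upper() if r in AMINO_ACIDS]


def amino_acid_composition(seq: str) -> list[float]:
    """Fraction of each standard residue in seq (20 values)."""
    residues = _standard_residues(seq)
    counts = Counter(residues)
    total = len(residues)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq: str) -> list[float]:
    """Fraction of each ordered pair of adjacent residues (400 values)."""
    residues = _standard_residues(seq)
    pairs = Counter(a + b for a, b in zip(residues, residues[1:]))
    total = max(len(residues) - 1, 0)
    return [pairs[a + b] / total if total else 0.0
            for a, b in product(AMINO_ACIDS, repeat=2)]


def hybrid_features(seq: str) -> list[float]:
    # Hypothetical hybrid: AC and DC concatenated into one 420-value vector.
    return amino_acid_composition(seq) + dipeptide_composition(seq)


if __name__ == "__main__":
    example = "MKTAYIAKQR"  # made-up peptide, for illustration only
    print(len(hybrid_features(example)))  # 420
```

Vectors of this kind could then be passed to any SVM implementation as fixed-length inputs, regardless of the length of the original sequence.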

RESULTS

Overall maximum accuracy, evaluated by five-fold cross-validation, was 88.37%, 84.32%, 87.61%, and 85.63%, with Matthews correlation coefficient (MCC) values of 0.87, 0.83, 0.86, and 0.85, for the amino acid composition (AC), dipeptide composition (DC), PSSM profile, and Hybrid methods, respectively. Through this study, we found that the different subfamilies of Pg-activators are quite closely correlated in terms of amino acid, dipeptide, PSSM, and Hybrid compositions. Our prediction results therefore show that plasminogen activators can be predicted with high accuracy from their primary sequence. Prediction performance was also cross-checked by confusion matrix and ROC (receiver operating characteristic) analysis. A web server to facilitate the prediction of Pg-activators from primary sequence data was implemented.
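For readers unfamiliar with the reported metrics, the sketch below shows how accuracy and MCC can be derived from a confusion matrix, together with a simple five-fold index split. It rests on assumptions: the abstract does not say which multiclass form of MCC was used, so the multiclass (R_K) generalisation is shown here, and the fold helper is unshuffled and unstratified. All function names are hypothetical.

```python
import math


def k_fold_indices(n_samples: int, k: int = 5) -> list[list[int]]:
    """Split indices 0..n_samples-1 into k disjoint test folds.

    Simplification: no shuffling and no class stratification.
    """
    return [list(range(i, n_samples, k)) for i in range(k)]


def confusion_matrix(y_true, y_pred, n_classes: int) -> list[list[int]]:
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm) -> float:
    """Correct predictions (the diagonal) divided by all predictions."""
    total = sum(map(sum, cm))
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def multiclass_mcc(cm) -> float:
    """Multiclass (R_K) MCC; reduces to the binary MCC for two classes."""
    k = len(cm)
    s = sum(map(sum, cm))                      # total samples
    c = sum(cm[i][i] for i in range(k))        # correctly predicted samples
    t = [sum(cm[i]) for i in range(k)]         # true count per class
    p = [sum(cm[i][j] for i in range(k)) for j in range(k)]  # predicted count
    numerator = c * s - sum(ti * pi for ti, pi in zip(t, p))
    denominator = math.sqrt((s * s - sum(pi * pi for pi in p)) *
                            (s * s - sum(ti * ti for ti in t)))
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 2]
    cm = confusion_matrix(y_true, y_pred, n_classes=3)
    print(accuracy(cm), multiclass_mcc(cm))
```

In a cross-validation run, each fold from `k_fold_indices` would serve once as the test set while the remaining folds train the classifier, and the metrics would be computed on the pooled or averaged test predictions.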

CONCLUSION

The results show that dipeptide, PSSM profile, and Hybrid based methods perform better than single amino acid composition (AC). We have also developed a web server that predicts Pg-activators and their classification (available online at http://mamsap.it.deakin.edu.au/plas_pred/home.html). Our experimental results show that our approaches are faster and generally achieve good prediction performance.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c17/3924408/f6d44b988885/1756-0500-7-63-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c17/3924408/ebdaa064b316/1756-0500-7-63-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c17/3924408/a324f3b8d3ae/1756-0500-7-63-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c17/3924408/d2ede5f28a92/1756-0500-7-63-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c17/3924408/422c6addff41/1756-0500-7-63-5.jpg

Similar articles

1
Support vector machine (SVM) based multiclass prediction with basic statistical analysis of plasminogen activators.
BMC Res Notes. 2014 Jan 27;7:63. doi: 10.1186/1756-0500-7-63.
2
ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST.
Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W414-9. doi: 10.1093/nar/gkh350.
3
A machine learning based method for the prediction of secretory proteins using amino acid composition, their order and similarity-search.
In Silico Biol. 2008;8(2):129-40.
4
Prediction of mitochondrial proteins of malaria parasite using split amino acid composition and PSSM profile.
Amino Acids. 2010 Jun;39(1):101-10. doi: 10.1007/s00726-009-0381-1. Epub 2009 Nov 12.
5
Prediction of GTP interacting residues, dipeptides and tripeptides in a protein from its evolutionary information.
BMC Bioinformatics. 2010 Jun 3;11:301. doi: 10.1186/1471-2105-11-301.
6
Prediction of nuclear proteins using nuclear translocation signals proposed by probabilistic latent semantic indexing.
BMC Bioinformatics. 2012;13 Suppl 17(Suppl 17):S13. doi: 10.1186/1471-2105-13-S17-S13. Epub 2012 Dec 13.
7
Identifying DNA-binding proteins by combining support vector machine and PSSM distance transformation.
BMC Syst Biol. 2015;9 Suppl 1(Suppl 1):S10. doi: 10.1186/1752-0509-9-S1-S10. Epub 2015 Feb 6.
8
SVM based prediction of RNA-binding proteins using binding residues and evolutionary information.
J Mol Recognit. 2011 Mar-Apr;24(2):303-13. doi: 10.1002/jmr.1061.
9
Prediction of membrane transport proteins and their substrate specificities using primary sequence information.
PLoS One. 2014 Jun 26;9(6):e100278. doi: 10.1371/journal.pone.0100278. eCollection 2014.
10
Mechanism of action of omega-amino acids on plasminogen activation and fibrinolysis induced by staphylokinase.
Biochemistry (Mosc). 2007 Jul;72(7):707-15. doi: 10.1134/s0006297907070048.

Cited by

1
Computational method for aromatase-related proteins using machine learning approach.
PLoS One. 2023 Mar 29;18(3):e0283567. doi: 10.1371/journal.pone.0283567. eCollection 2023.
2
Ion-pumping microbial rhodopsin protein classification by machine learning approach.
BMC Bioinformatics. 2023 Jan 27;24(1):29. doi: 10.1186/s12859-023-05138-x.
3
Distinguishing Glioblastoma Subtypes by Methylation Signatures.
Front Genet. 2020 Nov 24;11:604336. doi: 10.3389/fgene.2020.604336. eCollection 2020.
4
Harnessing the evolutionary information on oxygen binding proteins through Support Vector Machines based modules.
BMC Res Notes. 2018 May 11;11(1):290. doi: 10.1186/s13104-018-3383-9.
5
Point-of-care testing in the early diagnosis of acute pesticide intoxication: The example of paraquat.
Biomicrofluidics. 2018 Jan 19;12(1):011501. doi: 10.1063/1.5003848. eCollection 2018 Jan.
6
BacHbpred: Support Vector Machine Methods for the Prediction of Bacterial Hemoglobin-Like Proteins.
Adv Bioinformatics. 2016;2016:8150784. doi: 10.1155/2016/8150784. Epub 2016 Feb 29.

References

1
Analysis and prediction of cancerlectins using evolutionary and domain information.
BMC Res Notes. 2011 Jul 20;4:237. doi: 10.1186/1756-0500-4-237.
2
MHCBN 4.0: A database of MHC/TAP binding peptides and T-cell epitopes.
BMC Res Notes. 2009 Apr 20;2:61. doi: 10.1186/1756-0500-2-61.
3
Comparative analysis of complete genome sequences of three avian coronaviruses reveals a novel group 3c coronavirus.
J Virol. 2009 Jan;83(2):908-17. doi: 10.1128/JVI.01977-08. Epub 2008 Oct 29.
4
Oxypred: prediction and classification of oxygen-binding proteins.
Genomics Proteomics Bioinformatics. 2007 Dec;5(3-4):250-2. doi: 10.1016/S1672-0229(08)60012-1.
5
Prediction of RNA binding sites in a protein using SVM and PSSM profile.
Proteins. 2008 Apr;71(1):189-94. doi: 10.1002/prot.21677.
6
Support Vector Machine-based method for predicting subcellular localization of mycobacterial proteins using evolutionary information and motifs.
BMC Bioinformatics. 2007 Sep 13;8:337. doi: 10.1186/1471-2105-8-337.
7
Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences.
Bioinformatics. 2006 Jul 1;22(13):1658-9. doi: 10.1093/bioinformatics/btl158. Epub 2006 May 26.
8
Prediction of mitochondrial proteins using support vector machine and hidden Markov model.
J Biol Chem. 2006 Mar 3;281(9):5357-63. doi: 10.1074/jbc.M511061200. Epub 2005 Dec 8.
9
Plasminogen activators: a comparison.
Vascul Pharmacol. 2006 Jan;44(1):1-9. doi: 10.1016/j.vph.2005.09.003. Epub 2005 Nov 7.
10
Structure and function of the plasminogen/plasmin system.
Thromb Haemost. 2005 Apr;93(4):647-54. doi: 10.1160/TH04-12-0842.