Prediction of functional class of novel bacterial proteins without the use of sequence similarity by a statistical learning method.

Authors

Cui J, Han L Y, Cai C Z, Zheng C J, Ji Z L, Chen Y Z

Affiliation

Bioinformatics and Drug Design Group, Department of Computational Science, National University of Singapore, Singapore.

Publication information

J Mol Microbiol Biotechnol. 2005;9(2):86-100. doi: 10.1159/000088839.

DOI: 10.1159/000088839
PMID: 16319498

Abstract

A substantial percentage of the putative protein-encoding open reading frames (ORFs) in bacterial genomes have no homolog of known function, and their function cannot be confidently assigned on the basis of sequence similarity. Methods not based on sequence similarity are needed and are being developed. One method, SVMProt (http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi), predicts protein functional family irrespective of sequence similarity (Nucleic Acids Res. 2003;31:3692-3697). While it has been tested on a large number of proteins, its capability for non-homologous proteins has so far been evaluated for a relatively small number of proteins, and additional tests are needed to more fully assess SVMProt. In this work, 90 novel bacterial proteins (non-homologous to known proteins) are used to evaluate the capability of SVMProt. These proteins are such that none of their homologs are in the Swiss-Prot database, their functions are not clearly described in the literature, and they themselves and their homologs are not included in the training sets of SVMProt. They represent proteins whose function cannot be confidently predicted by sequence similarity methods at present. The predicted functional class of 76.7% of these proteins shows various levels of consistency with the literature-described function, compared to the overall accuracy of 87% for the SVMProt functional class assignment of 34,582 proteins that have at least one homolog of known function. Our study suggests that SVMProt is capable of assigning functional class for novel bacterial proteins at a level not too much lower than that of sequence alignment methods for homologous proteins.
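
To make the idea of predicting function "irrespective of sequence similarity" more concrete, the sketch below computes an amino acid composition vector, one simple example of a descriptor that is derived from the sequence alone and needs no alignment against known proteins. This is an illustration only: the abstract does not specify which features SVMProt uses, and the names `AMINO_ACIDS` and `composition` are hypothetical, not part of SVMProt.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# sequence maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Hypothetical helper for illustration; not SVMProt's feature set.
    Non-standard characters (e.g. X, gaps) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # A made-up peptide, used only to show the shape of the output.
    vector = composition("MKVLAAGIVGL")
    print(len(vector))  # one entry per standard amino acid
```

A fixed-length vector like this could, in principle, be passed to any statistical classifier, which is why such descriptors remain usable when no homolog of known function exists.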

Similar articles

1
Prediction of functional class of novel bacterial proteins without the use of sequence similarity by a statistical learning method.
J Mol Microbiol Biotechnol. 2005;9(2):86-100. doi: 10.1159/000088839.
2
Prediction of functional class of novel viral proteins by a statistical learning method irrespective of sequence similarity.
Virology. 2005 Jan 5;331(1):136-43. doi: 10.1016/j.virol.2004.10.020.
3
Predicting functional family of novel enzymes irrespective of sequence similarity: a statistical learning approach.
Nucleic Acids Res. 2004 Dec 7;32(21):6437-44. doi: 10.1093/nar/gkh984. Print 2004.
4
Prediction of functional class of novel plant proteins by a statistical learning method.
New Phytol. 2005 Oct;168(1):109-21. doi: 10.1111/j.1469-8137.2005.01482.x.
5
Prediction of the functional class of metal-binding proteins from sequence derived physicochemical properties by support vector machine approach.
BMC Bioinformatics. 2006 Dec 18;7 Suppl 5(Suppl 5):S13. doi: 10.1186/1471-2105-7-S5-S13.
6
Enzyme family classification by support vector machines.
Proteins. 2004 Apr 1;55(1):66-76. doi: 10.1002/prot.20045.
7
SVM-Prot: Web-based support vector machine software for functional classification of a protein from its primary sequence.
Nucleic Acids Res. 2003 Jul 1;31(13):3692-7. doi: 10.1093/nar/gkg600.
8
Prediction of functional class of the SARS coronavirus proteins by a statistical learning method.
J Proteome Res. 2005 Sep-Oct;4(5):1855-62. doi: 10.1021/pr050110a.
9
Prediction of the functional class of lipid binding proteins from sequence-derived properties irrespective of sequence similarity.
J Lipid Res. 2006 Apr;47(4):824-31. doi: 10.1194/jlr.M500530-JLR200. Epub 2006 Jan 27.
10
Recent progresses in the application of machine learning approach for predicting protein functional class independent of sequence similarity.
Proteomics. 2006 Jul;6(14):4023-37. doi: 10.1002/pmic.200500938.

Cited by

1
Novel immune-modulator identified by a rapid, functional screen of the parapoxvirus ovis (Orf virus) genome.
Proteome Sci. 2012 Jan 13;10(1):4. doi: 10.1186/1477-5956-10-4.
2
Protective antigens against glanders identified by expression library immunization.
Front Microbiol. 2011 Nov 21;2:227. doi: 10.3389/fmicb.2011.00227. eCollection 2011.
3
Prediction of functional class of proteins and peptides irrespective of sequence homology by support vector machines.
Bioinform Biol Insights. 2009 Nov 24;1:19-47. doi: 10.4137/bbi.s315.
4
Accurate prediction of secreted substrates and identification of a conserved putative secretion signal for type III secretion systems.
PLoS Pathog. 2009 Apr;5(4):e1000375. doi: 10.1371/journal.ppat.1000375. Epub 2009 Apr 24.
5
PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence.
Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W32-7. doi: 10.1093/nar/gkl305.