Using pseudo amino acid composition to predict protease families by incorporating a series of protein biological features.

Authors

Hu Lele, Zheng Lulu, Wang Zhiwen, Li Bing, Liu Lei

Affiliation

Institute of Systems Biology, Shanghai University, Shanghai 200444, China.

Publication

Protein Pept Lett. 2011 Jun;18(6):552-8. doi: 10.2174/092986611795222795.

DOI: 10.2174/092986611795222795
PMID: 21271978

Abstract

Proteases are essential to most biological processes, yet they themselves remain intact during those processes. In this research, a computational approach was developed for predicting the families of proteases from their sequences. Following the concept of pseudo amino acid composition, and in order to capture the essential patterns of protease sequences, each protein sample was formulated as a series of its biological features. There were 132 biological features in total, derived from various biochemical and physicochemical properties of the constituent amino acids. The importance of these features to the prediction was rated by the Maximum Relevance Minimum Redundancy (mRMR) algorithm, and Incremental Feature Selection was then applied to select an optimal feature set, which was used to construct a predictor based on the nearest neighbor algorithm. As a demonstration, the overall success rate of the jackknife test in assigning proteases to their seven families was 92.74%. Further analysis of the optimal feature set revealed that secondary structure and amino acid composition play the key roles in the classification, which is consistent with some previous findings. These promising results imply that the predictor presented in this paper may become a useful tool for studying proteases.
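
The pipeline described in the abstract (mRMR ranking, then Incremental Feature Selection scored by a jackknife test on a nearest neighbor classifier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the equal-width discretization, the difference form of the mRMR score, the squared Euclidean distance, and the toy data `X_toy` / `y_toy` (3 features, 2 classes, rather than 132 features and seven families) are all assumptions made for the example.

```python
import math
from collections import Counter


def discretize(column, bins=3):
    """Equal-width binning of one feature column (assumed scheme)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0] * len(column)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in column]


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * math.log(c * n / (pa[x] * pb[y]))
               for (x, y), c in pab.items())


def mrmr_rank(X, y, bins=3):
    """Greedy mRMR: relevance to y minus mean redundancy with chosen features."""
    cols = [discretize([row[j] for row in X], bins) for j in range(len(X[0]))]
    relevance = [mutual_information(c, y) for c in cols]
    ranked, remaining = [], list(range(len(cols)))
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            redundancy = sum(mutual_information(cols[j], cols[k])
                             for k in ranked) / len(ranked)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def jackknife_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if j == i:
                continue  # the held-out sample may not vote for itself
            d = sum((xi[f] - xj[f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def incremental_feature_selection(X, y, ranked):
    """Grow the feature set along the ranking; keep the best-scoring prefix."""
    best_acc, best_k = -1.0, 0
    for k in range(1, len(ranked) + 1):
        acc = jackknife_accuracy(X, y, ranked[:k])
        if acc > best_acc:  # strict '>' prefers the smaller set on ties
            best_acc, best_k = acc, k
    return ranked[:best_k], best_acc


# Hypothetical toy data: feature 0 separates the classes, features 1-2 are noise.
X_toy = [[0.10, 0.9, 0.1], [0.20, 0.1, 0.2], [0.15, 0.5, 0.9],
         [0.90, 0.8, 0.8], [0.80, 0.2, 0.7], [0.85, 0.4, 0.3]]
y_toy = [0, 0, 0, 1, 1, 1]

ranked = mrmr_rank(X_toy, y_toy)
selected, accuracy = incremental_feature_selection(X_toy, y_toy, ranked)
print("mRMR ranking:", ranked)
print("selected features:", selected, "jackknife accuracy:", accuracy)
```

On real data, each row of `X` would hold the feature vector of one protein and `y` its family label. The strict comparison in `incremental_feature_selection` is one possible tie-breaking choice, favoring the smallest feature set that reaches the best jackknife accuracy.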

Similar articles

1
Using pseudo amino acid composition to predict protease families by incorporating a series of protein biological features.
Protein Pept Lett. 2011 Jun;18(6):552-8. doi: 10.2174/092986611795222795.
2
Prediction of protease types in a hybridization space.
Biochem Biophys Res Commun. 2006 Jan 20;339(3):1015-20. doi: 10.1016/j.bbrc.2005.10.196. Epub 2005 Nov 9.
3
Identification of proteases and their types.
Anal Biochem. 2009 Feb 1;385(1):153-60. doi: 10.1016/j.ab.2008.10.020. Epub 2008 Nov 1.
4
Predicting protease types by hybridizing gene ontology and pseudo amino acid composition.
Proteins. 2006 May 15;63(3):681-4. doi: 10.1002/prot.20898.
5
Predicting protein subcellular locations with feature selection and analysis.
Protein Pept Lett. 2010 Apr;17(4):464-72. doi: 10.2174/092986610790963654.
6
Prediction of Golgi-resident protein types using general form of Chou's pseudo-amino acid compositions: Approaches with minimal redundancy maximal relevance feature selection.
J Theor Biol. 2016 Aug 7;402:38-44. doi: 10.1016/j.jtbi.2016.04.032. Epub 2016 May 4.
7
Predicting transcriptional activity of multiple site p53 mutants based on hybrid properties.
PLoS One. 2011;6(8):e22940. doi: 10.1371/journal.pone.0022940. Epub 2011 Aug 8.
8
Prediction of tyrosine sulfation with mRMR feature selection and analysis.
J Proteome Res. 2010 Dec 3;9(12):6490-7. doi: 10.1021/pr1007152. Epub 2010 Nov 11.
9
Prediction of active sites of enzymes by maximum relevance minimum redundancy (mRMR) feature selection.
Mol Biosyst. 2013 Jan 27;9(1):61-9. doi: 10.1039/c2mb25327e. Epub 2012 Nov 2.
10
Prediction of lysine ubiquitination with mRMR feature selection and analysis.
Amino Acids. 2012 Apr;42(4):1387-95. doi: 10.1007/s00726-011-0835-0. Epub 2011 Jan 26.

Cited by

1
Comprehensive comparative analysis and identification of RNA-binding protein domains: multi-class classification and feature selection.
J Theor Biol. 2012 Nov 7;312:65-75. doi: 10.1016/j.jtbi.2012.07.013. Epub 2012 Aug 3.
2
A multi-label predictor for identifying the subcellular locations of singleplex and multiplex eukaryotic proteins.
PLoS One. 2012;7(5):e36317. doi: 10.1371/journal.pone.0036317. Epub 2012 May 22.
3
Identification of colorectal cancer related genes with mRMR and shortest path in protein-protein interaction network.
PLoS One. 2012;7(4):e33393. doi: 10.1371/journal.pone.0033393. Epub 2012 Apr 4.
4
iNR-PhysChem: a sequence-based predictor for identifying nuclear receptors and their subfamilies via physical-chemical property matrix.
PLoS One. 2012;7(2):e30869. doi: 10.1371/journal.pone.0030869. Epub 2012 Feb 21.