Improved identification of outer membrane beta barrel proteins using primary sequence, predicted secondary structure, and evolutionary information.

Affiliation

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada.

Publication information

Proteins. 2011 Jan;79(1):294-303. doi: 10.1002/prot.22882.

Abstract

Membrane proteins (MPs) are difficult to identify in genomes and to crystallize, which makes their tertiary structures hard to determine. MPs can be categorized into α-helical membrane proteins (AMPs) and outer membrane proteins, most of which adopt beta barrel folds (OMBBs). AMPs are relatively easy to predict from a protein sequence because they usually include several long membrane-spanning hydrophobic α-helices. OMBBs play important roles in cell biology and are targeted by multiple drugs, but they are more challenging to identify because their membrane-spanning regions are shorter and lack a folding pattern as consistent as that of the AMPs. Hence, accurate in silico methods for predicting OMBBs from their primary sequences are needed. We present an accurate sequence-based predictor of OMBBs, called OMBBpred, which uses a Support Vector Machine classifier and a custom-designed set of 34 novel numerical descriptors derived from predicted secondary structures, hydrophobicity, and evolutionary information. Our method outperforms current OMBB predictors, achieving accuracy above 98% on two existing benchmark datasets and 96% on a new large dataset. Depending on the dataset, OMBBpred reduces the error rate of the second-best method by between 13% and 65%, and it generates predictions with specificity above 96%. Our solution is a useful tool for high-throughput discovery of OMBBs on a genome scale and is available at http://biomine.ece.ualberta.ca/OMBBpred/OMBBpred.htm.
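
The abstract states the improvement as a reduction in error rate and separately quotes specificity. The sketch below shows how those two quantities are commonly computed, assuming that error rate is 1 minus accuracy and that the reduction is measured relative to the baseline's error rate. The function names, the 96% baseline accuracy, and the confusion-matrix counts are hypothetical illustrations, not values or code from the paper.

```python
def error_rate(accuracy):
    """Error rate as the complement of accuracy (both fractions in [0, 1])."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    return 1.0 - accuracy


def relative_error_reduction(acc_new, acc_baseline):
    """Fraction of the baseline's errors that the new method removes."""
    err_new = error_rate(acc_new)
    err_base = error_rate(acc_baseline)
    if err_base == 0.0:
        raise ValueError("baseline makes no errors; reduction is undefined")
    return (err_base - err_new) / err_base


def specificity(true_negatives, false_positives):
    """Share of true non-OMBB proteins that are correctly rejected."""
    total_negatives = true_negatives + false_positives
    if total_negatives == 0:
        raise ValueError("no negative examples")
    return true_negatives / total_negatives


if __name__ == "__main__":
    # Hypothetical numbers: a new method at 98% accuracy against a
    # baseline at 96% accuracy (the baseline value is assumed here).
    reduction = relative_error_reduction(0.98, 0.96)
    print(f"relative error reduction: {reduction:.0%}")

    # Hypothetical counts: 960 non-OMBBs rejected, 40 wrongly accepted.
    print(f"specificity: {specificity(960, 40):.2%}")
```

Under this definition, a gain of two percentage points in accuracy can correspond to a large relative reduction in errors when the baseline is already accurate, which is why both figures are worth reading together.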
