Influence of amino acid properties for discriminating outer membrane proteins at better accuracy.

Author information

Gromiha M Michael, Suwa Makiko

Affiliation

Computational Biology Research Center, CBRC, National Institute of Advanced Industrial Science and Technology, AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Tokyo 135-0064, Japan.

Publication information

Biochim Biophys Acta. 2006 Sep;1764(9):1493-7. doi: 10.1016/j.bbapap.2006.07.005. Epub 2006 Jul 31.

Abstract

Discriminating outer membrane proteins (OMPs) from other folding types of globular and membrane proteins is an important task, both for identifying OMPs in genomic sequences and for successfully predicting their secondary and tertiary structures. In this work, we analyzed the influence of physico-chemical, energetic and conformational properties of amino acid residues on discriminating OMPs, using different machine learning algorithms, including Bayes rules, logistic functions, neural networks, support vector machines, and decision trees. We observed that most of the properties discriminated OMPs with similar accuracy. A neural network using the free energy change property discriminated OMPs from other folding types of globular and membrane proteins with a 5-fold cross-validation accuracy of 94.4% on a dataset of 1,088 proteins, which is better than the accuracy obtained with amino acid composition. The accuracy of discriminating globular proteins is 94.3% and that of transmembrane helical (TMH) proteins is 91.8%. Further, the neural network method was tested on globular proteins belonging to 30 major folding types, and it successfully excluded 99.4% of the 1,612 non-redundant proteins considered. These accuracy levels are comparable to or better than those of other methods in the literature. We suggest that this method could be used effectively to discriminate OMPs and to detect OMPs in genomic sequences.
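
The abstract does not spell out how a residue property is turned into classifier input or how the 5 folds were drawn. As a rough illustration only, the sketch below assumes one plausible encoding (amino acid composition scaled by a per-residue property value) and a simple interleaved 5-fold split. The function names `composition`, `property_weighted`, and `k_fold_indices`, and the `scale` table passed in, are hypothetical and are not taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(sequence):
    """Return the fraction of each standard residue in the sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[a] / total for a in AMINO_ACIDS]


def property_weighted(sequence, scale):
    """Scale each composition value by a per-residue property value.

    `scale` maps each residue letter to a number (for example a free
    energy change value). This encoding is an assumption made for
    illustration, not the paper's documented feature definition.
    """
    return [f * scale[a] for f, a in zip(composition(sequence), AMINO_ACIDS)]


def k_fold_indices(n_items, k=5):
    """Yield (train, test) index lists for k-fold cross-validation."""
    # Interleaved assignment: item i goes to fold i % k.
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

With 1,088 proteins, each of the five test folds would hold 217 or 218 proteins, and every protein is tested exactly once. In practice the folds would likely be stratified by class (OMP, globular, TMH) and the feature vectors passed to a classifier of choice; neither step is shown here.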

