Using maximum entropy model to predict protein secondary structure with single sequence.

Authors

Ding Yong-Sheng, Zhang Tong-Liang, Gu Quan, Zhao Pei-Ying, Chou Kuo-Chen

Affiliation

College of Information Sciences and Technology, Donghua University, Shanghai, China.

Publication

Protein Pept Lett. 2009;16(5):552-60. doi: 10.2174/092986609788167833.

PMID:19442235

Abstract

Protein secondary structure prediction has been studied by many investigators, yet it remains worth revisiting because of its importance in protein science. Several studies indicate that knowledge of protein structural classes can provide useful information for determining protein secondary structure. In particular, the performance of recently developed prediction algorithms has improved rapidly through the incorporation of homologous multiple sequence alignment information. Unfortunately, this kind of information is not available for a significant number of proteins. It is therefore necessary to develop methods based on the query protein sequence alone, the so-called single-sequence methods. Here, we propose a novel single-sequence approach that takes various kinds of contextual information into account and uses a maximum entropy model classifier as the prediction engine. As a demonstration, cross-validation tests were performed with the new method on datasets containing proteins from different structural classes. The results are quite promising, indicating that the new method may become a useful tool in protein science, or at least play a complementary role to existing protein secondary structure prediction methods.
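
The abstract does not specify the contextual features, the window size, or the training procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It treats each residue's sequence neighborhood as its context and fits a maximum entropy classifier (equivalent to multinomial logistic regression) by gradient ascent. The three-state alphabet `H`/`E`/`C`, the window half-width, the feature encoding, the hyperparameters, and the toy sequence are all illustrative choices, not the authors' settings or data.

```python
import math
from collections import defaultdict

STATES = ("H", "E", "C")  # assumed 3-state alphabet: helix, strand, coil


def window_features(seq, i, half_width=2):
    """Contextual features for residue i: the amino acid at each window offset."""
    feats = []
    for off in range(-half_width, half_width + 1):
        j = i + off
        aa = seq[j] if 0 <= j < len(seq) else "-"  # pad beyond the termini
        feats.append(f"{off}:{aa}")
    return feats


def predict_proba(weights, feats):
    """Maximum entropy form: p(y|x) = exp(sum of active weights for y) / Z(x)."""
    scores = [sum(weights.get((f, y), 0.0) for f in feats) for y in STATES]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def train(examples, epochs=200, lr=0.5, l2=1e-3):
    """Batch gradient ascent on the L2-regularized conditional log-likelihood."""
    weights = defaultdict(float)
    n = len(examples)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, label in examples:
            probs = predict_proba(weights, feats)
            for y, p in zip(STATES, probs):
                # Observed feature count minus the count expected under the model.
                delta = (1.0 if y == label else 0.0) - p
                for f in feats:
                    grad[(f, y)] += delta
        for key, g in grad.items():
            weights[key] += lr * (g / n - l2 * weights[key])
    return weights


def predict(weights, seq, half_width=2):
    """Assign each residue the state with the highest model probability."""
    out = []
    for i in range(len(seq)):
        probs = predict_proba(weights, window_features(seq, i, half_width))
        out.append(STATES[probs.index(max(probs))])
    return "".join(out)


# Toy data invented purely for illustration (not taken from the paper).
toy_seq = "AAAAAVVVVVGGGGG"
toy_lab = "HHHHHEEEEECCCCC"
examples = [(window_features(toy_seq, i), toy_lab[i]) for i in range(len(toy_seq))]
weights = train(examples)
predicted = predict(weights, toy_seq)
```

Because the only input is the query sequence itself, a sketch like this needs no alignment or homolog search. In a realistic setting the model would be trained on residues from many annotated proteins and evaluated by cross-validation, as the abstract describes.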

Similar articles

Prediction of protein structural classes for low-homology sequences based on predicted secondary structure.

BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S9. doi: 10.1186/1471-2105-11-S1-S9.

Using Chou's pseudo amino acid composition based on approximate entropy and an ensemble of AdaBoost classifiers to predict protein subnuclear location.

Amino Acids. 2008 May;34(4):669-75. doi: 10.1007/s00726-008-0034-9. Epub 2008 Feb 7.

[Protein secondary structure prediction based on maximum entropy model].

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2008 Apr;25(2):259-63.

High-accuracy prediction of protein structural class for low-similarity sequences based on predicted secondary structure.

Biochimie. 2011 Apr;93(4):710-4. doi: 10.1016/j.biochi.2011.01.001. Epub 2011 Jan 13.

A high-accuracy protein structural class prediction algorithm using predicted secondary structural information.

J Theor Biol. 2010 Dec 7;267(3):272-5. doi: 10.1016/j.jtbi.2010.09.007. Epub 2010 Sep 8.

Prediction of protein structural class using a complexity-based distance measure.

Amino Acids. 2010 Mar;38(3):721-8. doi: 10.1007/s00726-009-0276-1. Epub 2009 Mar 28.

A seqlet-based maximum entropy Markov approach for protein secondary structure prediction.

Sci China C Life Sci. 2005 Aug;48(4):394-405. doi: 10.1360/062004-53.

Structural class prediction of protein using novel feature extraction method from chaos game representation of predicted secondary structure.

J Theor Biol. 2016 Jul 7;400:1-10. doi: 10.1016/j.jtbi.2016.04.011. Epub 2016 Apr 12.

Improving protein secondary structure prediction using a multi-modal BP method.

Comput Biol Med. 2011 Oct;41(10):946-59. doi: 10.1016/j.compbiomed.2011.08.005. Epub 2011 Aug 30.

Cited by

Gene ontology based transfer learning for protein subcellular localization.

BMC Bioinformatics. 2011 Feb 2;12:44. doi: 10.1186/1471-2105-12-44.

Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences.

BMC Bioinformatics. 2009 Dec 13;10:414. doi: 10.1186/1471-2105-10-414.
