Lampros Christos, Exarchos Themis P, Fotiadis Dimitrios I
Unit of Medical Technology and Intelligent Information Systems, Department of Computer Science, University of Ioannina, GR 45110 Ioannina, Greece.
Comput Biol Med. 2007 Sep;37(9):1211-24. doi: 10.1016/j.compbiomed.2006.10.014. Epub 2006 Dec 11.
This work describes a hidden Markov model (HMM) with a reduced number of states that simultaneously learns the amino acid sequence and the secondary structure of proteins with known three-dimensional structure. The model is applied to two tasks: protein class prediction and fold recognition. The Protein Data Bank and the annotation of the SCOP database are used to train and evaluate the proposed HMM on a number of protein classes and folds. Results demonstrate that the reduced state-space HMM classifies proteins as well as, and in some cases better than, an HMM trained on the amino acid sequence. The major advantage of the proposed approach is that it employs a small number of states and a low-complexity training algorithm, and is therefore relatively fast.
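The general idea can be illustrated with a minimal sketch. The abstract does not specify the model's architecture, so everything below is an assumption made for illustration and not the authors' method: the hidden states are taken to be three secondary-structure labels (H, E, C), the emissions are the 20 amino acids, training is done by simple counting with add-one smoothing (possible because the structure of the training proteins is known), and a query is assigned to the class whose model gives the highest forward-algorithm log-likelihood. The toy sequences and the names `train`, `log_likelihood`, and `classify` are hypothetical.

```python
import math

STATES = "HEC"                  # assumed states: helix, strand, coil
AMINO = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def _log_normalize(counts):
    """Turn a dict of positive counts into a dict of log-probabilities."""
    total = sum(counts.values())
    return {k: math.log(v / total) for k, v in counts.items()}


def train(examples, pseudo=1.0):
    """Estimate HMM parameters by counting labeled (sequence, structure) pairs.

    No iterative re-estimation is needed because the state path is observed.
    """
    start = {s: pseudo for s in STATES}
    trans = {s: {t: pseudo for t in STATES} for s in STATES}
    emit = {s: {a: pseudo for a in AMINO} for s in STATES}
    for seq, ss in examples:
        start[ss[0]] += 1
        for i, (aa, state) in enumerate(zip(seq, ss)):
            emit[state][aa] += 1
            if i + 1 < len(ss):
                trans[state][ss[i + 1]] += 1
    return (
        _log_normalize(start),
        {s: _log_normalize(trans[s]) for s in STATES},
        {s: _log_normalize(emit[s]) for s in STATES},
    )


def _logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(model, seq):
    """Forward algorithm in log space: log P(seq | model), states summed out."""
    start, trans, emit = model
    alpha = {s: start[s] + emit[s][seq[0]] for s in STATES}
    for aa in seq[1:]:
        alpha = {
            t: _logsumexp([alpha[s] + trans[s][t] for s in STATES]) + emit[t][aa]
            for t in STATES
        }
    return _logsumexp(list(alpha.values()))


def classify(models, seq):
    """Return the class label whose model scores the sequence highest."""
    return max(models, key=lambda label: log_likelihood(models[label], seq))


# Toy, made-up training data: one labeled protein fragment per class.
models = {
    "all-alpha": train([("ALEALEALEA", "HHHHHHHHHH")]),
    "all-beta": train([("VIYVIYVIYV", "EEEEEEEEEE")]),
}
predicted = classify(models, "ALEALE")
```

With only three states, each forward pass costs a small constant amount of work per residue, which is the kind of saving a reduced state space is meant to provide. A realistic model would be trained on many proteins per class and might use a different state definition than the one assumed here.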