Lampros Christos, Simos Thomas, Exarchos Themis P, Exarchos Konstantinos P, Papaloukas Costas, Fotiadis Dimitrios I
Department of Materials Science and Engineering, Unit of Medical Technology and Intelligent Information Systems, University of Ioannina, GR 45110 Ioannina, Greece.
J Bioinform Comput Biol. 2014 Aug;12(4):1450016. doi: 10.1142/S0219720014500164. Epub 2014 Jul 14.
Protein fold classification is a challenging task closely tied to protein structure determination. In this work, we tested an optimization strategy on a Markov chain and on a recently introduced Hidden Markov Model (HMM) with reduced state-space topology. Proteins of unknown structure were scored against both models, and the resulting scores were then optimized with a local optimization method. The Protein Data Bank (PDB) and the annotation of the Structural Classification of Proteins (SCOP) database were used to evaluate the proposed methodology. The results demonstrated that the fold classification accuracy of the optimized HMM was substantially higher than that of the Markov chain or the reduced state-space HMM approaches. The proposed methodology achieved an accuracy of 41.4% on fold classification, while Sequence Alignment and Modeling (SAM), which was used for comparison, reached an accuracy of 38%.
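To make the scoring step concrete, the following is a minimal sketch of how a sequence might be scored against one Markov chain per fold and assigned to the highest-scoring fold. It is an illustration under stated assumptions, not the authors' implementation: it assumes a first-order chain over the 20 standard amino acids with add-one smoothing, it covers neither the reduced state-space HMM nor the local optimization of the scores, and all function names and training sequences are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def train_markov_chain(sequences, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Estimate log initial and log transition probabilities for one fold.

    Assumption: first-order chain with additive (pseudocount) smoothing.
    """
    init = {a: pseudocount for a in alphabet}
    trans = {a: {b: pseudocount for b in alphabet} for a in alphabet}
    for seq in sequences:
        residues = [c for c in seq if c in init]  # skip non-standard symbols
        if not residues:
            continue
        init[residues[0]] += 1
        for prev, cur in zip(residues, residues[1:]):
            trans[prev][cur] += 1
    init_total = sum(init.values())
    log_init = {a: math.log(v / init_total) for a, v in init.items()}
    log_trans = {}
    for a, row in trans.items():
        row_total = sum(row.values())
        log_trans[a] = {b: math.log(v / row_total) for b, v in row.items()}
    return log_init, log_trans


def log_likelihood(seq, model):
    """Log-probability of a sequence under one fold's Markov chain."""
    log_init, log_trans = model
    residues = [c for c in seq if c in log_init]
    if not residues:
        return float("-inf")
    score = log_init[residues[0]]
    for prev, cur in zip(residues, residues[1:]):
        score += log_trans[prev][cur]
    return score


def classify_fold(seq, models):
    """Assign the fold whose model gives the highest log-likelihood."""
    return max(models, key=lambda fold: log_likelihood(seq, models[fold]))


# Toy, made-up training data (hypothetical; not taken from PDB or SCOP).
toy_training = {
    "fold_a": ["AAAAGAAAAG", "AAGAAAAGAA"],
    "fold_b": ["LKLKLKLKLK", "KLKLKLKLKL"],
}
models = {fold: train_markov_chain(seqs) for fold, seqs in toy_training.items()}
print(classify_fold("AAAGAAA", models))
print(classify_fold("LKLKLK", models))
```

In a realistic setting, each model would be trained on the sequences of one SCOP fold. The raw per-fold scores computed here are the kind of quantity that the abstract describes as being subsequently refined by a local optimization method.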