Analysis of an optimal hidden Markov model for secondary structure prediction.

Authors

Martin Juliette, Gibrat Jean-François, Rodolphe François

Affiliation

INSERM U726, Equipe de Bioinformatique Génomique et Moléculaire, Université Denis Diderot Paris 7, 2 place Jussieu, 75251 Paris Cedex 05, France.

Publication information

BMC Struct Biol. 2006 Dec 13;6:25. doi: 10.1186/1472-6807-6-25.

DOI:10.1186/1472-6807-6-25

PMID:17166267

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1769381/

Abstract

BACKGROUND

Secondary structure prediction is a useful first step toward 3D structure prediction. A number of successful secondary structure prediction methods use neural networks, but neural networks are not intuitively interpretable. In contrast, hidden Markov models are graphical models that lend themselves to interpretation, and they have been used successfully in many bioinformatics applications. Because they offer a strong statistical foundation and allow model interpretation, we propose a method based on hidden Markov models.

RESULTS

Our HMM is designed without prior knowledge. It is selected from a collection of models of increasing size, using statistical and accuracy criteria. The resulting model has 36 hidden states: 15 that model alpha-helices, 12 that model coil and 9 that model beta-strands. Connections between hidden states and state emission probabilities reflect the organization of protein structures into secondary structure segments. We first analyze the model's features and show how it offers a new view of local structures. We then use it for secondary structure prediction. Our model performs well on single sequences, with a Q3 score of 68.8%, more than one point above PSIPRED's single-sequence prediction. A straightforward extension of the method allows the use of multiple sequence alignments, raising the Q3 score to 75.5%.
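For readers unfamiliar with the Q3 score quoted above, the sketch below computes it as the percentage of residues whose predicted class (helix H, strand E, coil C) matches the observed class. The function name `q3_score` and the two example strings are illustrative assumptions, not data or code from the paper.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of positions where the predicted
    three-state class (H, E or C) equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Invented example: 16 residues, 2 of them mispredicted.
observed = "CCHHHHHHCCEEEECC"
predicted = "CCHHHHHCCCEEECCC"
print(f"Q3 = {q3_score(predicted, observed):.1f}%")  # prints "Q3 = 87.5%"
```

In practice the score is usually reported over a whole test set, either pooled over all residues or averaged per protein; which convention the authors used is not stated in the abstract.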

CONCLUSION

The hidden Markov model presented here achieves valuable prediction results using only a limited number of parameters. It provides an interpretable framework for protein secondary structure architecture. Furthermore, it can be used as a tool for generating protein sequences with a given secondary structure content.
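The generative use mentioned above can be illustrated with a toy example. The sketch below samples a residue sequence together with its hidden secondary structure path from a small HMM. It assumes a hypothetical three-state model (one state per class) with invented transition and emission probabilities over a few amino acids; the model in the paper has 36 hidden states and parameters estimated from data, none of which are reproduced here.

```python
import random

# Hypothetical toy parameters, invented for illustration only.
INITIAL = {"H": 0.3, "E": 0.2, "C": 0.5}
TRANSITIONS = {
    "H": {"H": 0.90, "E": 0.00, "C": 0.10},  # helix never jumps directly to strand
    "E": {"H": 0.00, "E": 0.80, "C": 0.20},  # strand never jumps directly to helix
    "C": {"H": 0.10, "E": 0.10, "C": 0.80},
}
EMISSIONS = {
    "H": {"A": 0.4, "L": 0.3, "E": 0.2, "G": 0.1},
    "E": {"V": 0.4, "I": 0.3, "T": 0.2, "G": 0.1},
    "C": {"G": 0.4, "P": 0.3, "S": 0.2, "A": 0.1},
}


def draw(dist, rng):
    """Draw one key from a {key: probability} dictionary."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys], k=1)[0]


def sample(length, rng):
    """Sample `length` residues and the hidden state path that emitted them."""
    states, residues = [], []
    state = draw(INITIAL, rng)
    for _ in range(length):
        states.append(state)
        residues.append(draw(EMISSIONS[state], rng))
        state = draw(TRANSITIONS[state], rng)
    return "".join(residues), "".join(states)


rng = random.Random(0)  # fixed seed so repeated runs give the same sample
sequence, structure = sample(30, rng)
print(sequence)
print(structure)
```

Because the self-transition probabilities are high, the sampled path forms contiguous segments rather than isolated residues. Steering the overall helix, strand and coil content would require adjusting the transition probabilities or filtering samples, which this sketch does not attempt.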
