Protein secondary structure prediction for a single-sequence using hidden semi-Markov models.

Author information

Aydin Zafer, Altunbasak Yucel, Borodovsky Mark

Affiliations

School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0250, USA.

Publication information

BMC Bioinformatics. 2006 Mar 30;7:178. doi: 10.1186/1471-2105-7-178.

DOI:10.1186/1471-2105-7-178

PMID:16571137

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1479840/

Abstract

BACKGROUND

The accuracy of protein secondary structure prediction has been improving steadily towards the estimated theoretical limit of 88%. Prediction algorithms fall into two types: single-sequence algorithms assume that no information about other (homologous) proteins is available, while algorithms of the second type assume that such information is available and use it intensively. Single-sequence algorithms could make an important contribution to studies of proteins with no detected homologs; however, the accuracy of secondary structure prediction from a single sequence is not as high as when additional evolutionary information is present.

RESULTS

In this paper, we further refine and extend the hidden semi-Markov model (HSMM) initially considered in the BSPSS algorithm. We introduce an improved residue dependency model by considering the patterns of statistically significant amino acid correlation at structural segment borders. We also derive models that specialize in different sections of the dependency structure and incorporate them into the HSMM. In addition, we implement an iterative training method to refine estimates of the HSMM parameters. The three-state-per-residue accuracy and other accuracy measures of the new method, IPSSP, are shown to be comparable to or better than those of BSPSS and PSIPRED when tested under the single-sequence condition.
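To make the HSMM idea concrete, here is a minimal sketch of explicit-duration (semi-Markov) Viterbi decoding over the three states helix (H), strand (E), and loop (L). It is not the IPSSP or BSPSS implementation: it assumes residues within a segment are independent, whereas the abstract describes dependency models at segment borders, and it does not include iterative training. Every name and number here (`make_emission`, `uniform_duration`, `predict`, the favored residue sets, and the duration ranges) is a hypothetical choice made for illustration.

```python
import math

STATES = "HEL"  # helix, strand, loop
AMINO = "ACDEFGHIKLMNPQRSTVWY"
NEG_INF = float("-inf")

def safe_log(p):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def make_emission(favored, boost=4.0):
    """Toy residue distribution: favored residues get `boost` times the weight."""
    weights = {a: (boost if a in favored else 1.0) for a in AMINO}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

def uniform_duration(lo, hi):
    """Toy segment-length distribution, uniform on lo..hi inclusive."""
    return {d: 1.0 / (hi - lo + 1) for d in range(lo, hi + 1)}

# All parameters below are invented for illustration, not estimated from data.
EMIT = {"H": make_emission("AELKMQ"),
        "E": make_emission("VIYFWT"),
        "L": make_emission("GPNDS")}
DUR = {"H": uniform_duration(4, 12),
       "E": uniform_duration(2, 8),
       "L": uniform_duration(1, 10)}
# Semi-Markov: a segment is always followed by a segment of a different type.
TRANS = {p: {s: 0.5 for s in STATES if s != p} for p in STATES}
START = {s: 1.0 / len(STATES) for s in STATES}

def predict(seq):
    """Return the most probable H/E/L labeling of `seq` under the toy model."""
    n = len(seq)
    max_len = max(max(d) for d in DUR.values())
    # best[end][s]: best log-probability of seq[:end] whose last segment has type s.
    best = [dict.fromkeys(STATES, NEG_INF) for _ in range(n + 1)]
    back = [dict.fromkeys(STATES) for _ in range(n + 1)]
    for end in range(1, n + 1):
        for s in STATES:
            for d in range(1, min(max_len, end) + 1):
                start = end - d
                # Segment score: duration term plus independent residue emissions.
                seg = safe_log(DUR[s].get(d, 0.0)) + sum(
                    safe_log(EMIT[s].get(a, 0.0)) for a in seq[start:end])
                if start == 0:
                    prev, score = None, safe_log(START[s]) + seg
                else:
                    prev = max((p for p in STATES if p != s),
                               key=lambda p: best[start][p] + safe_log(TRANS[p][s]))
                    score = best[start][prev] + safe_log(TRANS[prev][s]) + seg
                if score > best[end][s]:
                    best[end][s] = score
                    back[end][s] = (start, prev)
    state = max(STATES, key=lambda s: best[n][s])
    if best[n][state] == NEG_INF:
        raise ValueError("no valid segmentation for this sequence")
    # Trace back segment by segment, emitting one label per residue.
    labels, end = [], n
    while end > 0:
        start, prev = back[end][state]
        labels.append(state * (end - start))
        end, state = start, prev
    return "".join(reversed(labels))

print(predict("AELKAELKGPGSVIVIV"))
```

The difference from an ordinary HMM is the inner loop over segment lengths `d`: each state emits a whole segment whose length is scored by an explicit duration distribution, rather than by repeated self-transitions.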

CONCLUSIONS

We have shown that new dependency models and training methods bring further improvements to single-sequence protein secondary structure prediction. The results were obtained under cross-validation conditions using a dataset in which no pair of sequences has significant sequence similarity. As new sequences are added to the database, it becomes possible to augment the dependency structure and obtain even higher accuracy. Current and future advances should contribute to improved function prediction for orphan proteins that are inscrutable to current similarity search methods.
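The three-state-per-residue accuracy named in the Results is commonly reported as Q3: the percentage of residues whose predicted state (helix, strand, or loop) matches the reference assignment. The following is a minimal sketch of that calculation; the function name `q3_accuracy` and the example strings are hypothetical and do not come from the paper.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose predicted H/E/L state matches the reference."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

# Invented example: 7 of 8 residues agree.
print(q3_accuracy("HHHEELLL", "HHHEELLH"))
```

Under cross-validation, a measure like this would be computed only on sequences held out from training and then combined across the folds.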
