Protein fold recognition using HMM-HMM alignment and dynamic programming.

Authors

Lyons James, Paliwal Kuldip K, Dehzangi Abdollah, Heffernan Rhys, Tsunoda Tatsuhiko, Sharma Alok

Affiliations

School of Engineering, Griffith University, Brisbane, QLD 4111, Australia.

University of Iowa, USA.

Publication information

J Theor Biol. 2016 Mar 21;393:67-74. doi: 10.1016/j.jtbi.2015.12.018. Epub 2016 Jan 19.

DOI:10.1016/j.jtbi.2015.12.018

PMID:26801876

Abstract

Determining the three-dimensional structures of proteins from their sequences is a challenging task in the biological sciences. For this purpose, protein fold recognition has been used as an intermediate step that helps classify a novel protein sequence into one of the known folds. The process of protein fold recognition encompasses feature extraction from protein sequences and classification of those features with suitable classifiers. Several feature extractors have been developed to retrieve useful information from protein sequences. These features are generally derived from a protein's sequential, physicochemical and evolutionary properties. Recognition accuracy has gradually improved over the last decade. However, it has yet to reach a reasonable and accepted level. In this work, we first applied HMM-HMM alignment of protein sequences using HHblits to extract profile HMM (PHMM) matrices. Then we computed the distance between the respective PHMM matrices using kernelized dynamic programming. We recorded a significant improvement in fold recognition over state-of-the-art feature extractors. The improvement in recognition accuracy is in the range of 2.7-11.6% on three benchmark datasets from the Structural Classification of Proteins.
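
The abstract does not spell out the dynamic programming recurrence, so the following is only a minimal sketch of the general idea: a dynamic-time-warping style alignment between two profile matrices (one row per residue position), with a Gaussian-kernel local cost. The function names, the `gamma` parameter, the kernel choice and the length normalization are all assumptions made for illustration and may differ from what the authors actually used.

```python
import math


def rbf_cost(u, v, gamma=1.0):
    """Hypothetical local cost: 1 - exp(-gamma * ||u - v||^2), in [0, 1)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return 1.0 - math.exp(-gamma * sq_dist)


def dtw_distance(P, Q, gamma=1.0):
    """Length-normalized warping distance between two profile matrices.

    P and Q are lists of rows; each row is the profile vector for one
    residue position (e.g. 20 emission values). This is an assumed,
    generic formulation, not the paper's exact algorithm.
    """
    n, m = len(P), len(Q)
    inf = float("inf")
    # D[i][j] = minimal accumulated cost of aligning P[:i] with Q[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = rbf_cost(P[i - 1], Q[j - 1], gamma)
            # Extend the cheapest of: step in P only, step in Q only, or both.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m] / (n + m)
```

Under these assumptions, the pairwise distances between a query profile and each training profile could serve as input to a nearest-neighbour or kernel-based classifier; because warping lets one position match several, the two profiles need not have the same length.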

Similar articles

Protein fold recognition by alignment of amino acid residues using kernelized dynamic time warping.

J Theor Biol. 2014 Aug 7;354:137-45. doi: 10.1016/j.jtbi.2014.03.033. Epub 2014 Mar 31.

Advancing the Accuracy of Protein Fold Recognition by Utilizing Profiles From Hidden Markov Models.

IEEE Trans Nanobioscience. 2015 Oct;14(7):761-72. doi: 10.1109/TNB.2015.2457906. Epub 2015 Jul 20.

Improving protein fold recognition and structural class prediction accuracies using physicochemical properties of amino acids.

J Theor Biol. 2016 Aug 7;402:117-28. doi: 10.1016/j.jtbi.2016.05.002. Epub 2016 May 7.

A tri-gram based feature extraction technique using linear probabilities of position specific scoring matrix for protein fold recognition.

IEEE Trans Nanobioscience. 2014 Mar;13(1):44-50. doi: 10.1109/TNB.2013.2296050.

HMM-ModE--improved classification using profile hidden Markov models by optimising the discrimination threshold and modifying emission probabilities with negative training sequences.

BMC Bioinformatics. 2007 Mar 27;8:104. doi: 10.1186/1471-2105-8-104.

MRFalign: protein homology detection through alignment of Markov random fields.

PLoS Comput Biol. 2014 Mar 27;10(3):e1003500. doi: 10.1371/journal.pcbi.1003500. eCollection 2014 Mar.

A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence.

J Mol Biol. 1997 Apr 11;267(4):1026-38. doi: 10.1006/jmbi.1997.0924.

HMMs in Protein Fold Classification.

Methods Mol Biol. 2017;1552:13-27. doi: 10.1007/978-1-4939-6753-7_2.

DPANN: improved sequence to structure alignments following fold recognition.

Proteins. 2004 Aug 15;56(3):528-38. doi: 10.1002/prot.20144.

Cited by

Biological Sequence Classification: A Review on Data and General Methods.

Research (Wash D C). 2022 Dec 19;2022:0011. doi: 10.34133/research.0011. eCollection 2022.

Profiles of Natural and Designed Protein-Like Sequences Effectively Bridge Protein Sequence Gaps: Implications in Distant Homology Detection.

Methods Mol Biol. 2022;2449:149-167. doi: 10.1007/978-1-0716-2095-3_5.

BioS2Net: Holistic Structural and Sequential Analysis of Biomolecules Using a Deep Neural Network.

Int J Mol Sci. 2022 Mar 9;23(6):2966. doi: 10.3390/ijms23062966.

Accurate Identification of Antioxidant Proteins Based on a Combination of Machine Learning Techniques and Hidden Markov Model Profiles.

Comput Math Methods Med. 2021 Aug 7;2021:5770981. doi: 10.1155/2021/5770981. eCollection 2021.

A secondary structure-based position-specific scoring matrix applied to the improvement in protein secondary structure prediction.

PLoS One. 2021 Jul 28;16(7):e0255076. doi: 10.1371/journal.pone.0255076. eCollection 2021.

DeepFrag-k: a fragment-based deep learning approach for protein fold recognition.

BMC Bioinformatics. 2020 Nov 18;21(Suppl 6):203. doi: 10.1186/s12859-020-3504-z.

PredDBP-Stack: Prediction of DNA-Binding Proteins from HMM Profiles using a Stacked Ensemble Method.

Biomed Res Int. 2020 Apr 13;2020:7297631. doi: 10.1155/2020/7297631. eCollection 2020.

HseSUMO: Sumoylation site prediction using half-sphere exposures of amino acids residues.

BMC Genomics. 2019 Apr 18;19(Suppl 9):982. doi: 10.1186/s12864-018-5206-8.

SumSec: Accurate Prediction of Sumoylation Sites Using Predicted Secondary Structure.

Molecules. 2018 Dec 10;23(12):3260. doi: 10.3390/molecules23123260.

Success: evolutionary and structural properties of amino acids prove effective for succinylation site prediction.

BMC Genomics. 2018 Jan 19;19(Suppl 1):923. doi: 10.1186/s12864-017-4336-8.
