Prediction of protein interdomain linker regions by a hidden Markov model.

Author information

Bae Kyounghwa, Mallick Bani K, Elsik Christine G

Affiliation

Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA.

Publication information

Bioinformatics. 2005 May 15;21(10):2264-70. doi: 10.1093/bioinformatics/bti363. Epub 2005 Mar 3.

DOI:10.1093/bioinformatics/bti363

PMID:15746283

Abstract

MOTIVATION

Our aim was to predict protein interdomain linker regions using sequence alone, without requiring known homology. Identifying linker regions will delineate domain boundaries, and can be used to computationally dissect proteins into domains prior to clustering them into families. We developed a hidden Markov model of linker/non-linker sequence regions using a linker index derived from amino acid propensity. We employed an efficient Bayesian estimation of the model using Markov Chain Monte Carlo, Gibbs sampling in particular, to simulate parameters from the posteriors. Our model recognizes sequence data to be continuous rather than categorical, and generates a probabilistic output.

RESULTS

We applied our method to a dataset of protein sequences in which domains and interdomain linkers had been delineated using the Pfam-A database. The prediction results are superior to a simpler method that also uses linker index.
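
The abstract describes a two-state (linker/non-linker) hidden Markov model whose observations are continuous linker-index values and whose output is probabilistic. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes Gaussian emissions and fixed, hand-picked parameters, whereas the paper estimates its parameters by Gibbs sampling. The function name `linker_posterior`, the parameter values, the toy scores, and the convention that higher scores are more linker-like are all hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def linker_posterior(index_values, start, trans, means, sds):
    """Per-residue P(state = linker | whole sequence) by scaled forward-backward.

    State 0 = non-linker, state 1 = linker. Emissions are assumed Gaussian
    (an illustrative assumption, not taken from the paper).
    """
    n = len(index_values)
    states = (0, 1)
    emit = [[gaussian_pdf(x, means[s], sds[s]) for s in states]
            for x in index_values]

    # Forward pass, rescaling each step so the values sum to 1.
    alpha, scale = [], []
    cur = [start[s] * emit[0][s] for s in states]
    for t in range(n):
        if t > 0:
            cur = [emit[t][s] * sum(alpha[t - 1][r] * trans[r][s] for r in states)
                   for s in states]
        c = sum(cur)
        alpha.append([v / c for v in cur])
        scale.append(c)

    # Backward pass, using the same scaling constants.
    beta = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = [sum(trans[s][r] * emit[t + 1][r] * beta[t + 1][r] for r in states)
                   / scale[t + 1]
                   for s in states]

    # Combine both passes and normalize to get the posterior of the linker state.
    posterior = []
    for t in range(n):
        gamma = [alpha[t][s] * beta[t][s] for s in states]
        posterior.append(gamma[1] / sum(gamma))
    return posterior


# Toy linker-index track (made-up values): low, then high, then low again.
scores = [-0.5] * 6 + [0.6] * 6 + [-0.5] * 6
posterior = linker_posterior(
    scores,
    start=[0.5, 0.5],                 # hypothetical initial state probabilities
    trans=[[0.9, 0.1], [0.1, 0.9]],   # hypothetical transition matrix
    means=(-0.4, 0.5),                # hypothetical emission means per state
    sds=(0.3, 0.3),                   # hypothetical emission standard deviations
)
for position, p in enumerate(posterior, start=1):
    print(position, round(p, 3))
```

Because the output is a probability per residue rather than a hard label, a downstream step can choose its own threshold for calling a linker region. In a Bayesian treatment such as the one the abstract describes, the fixed parameters above would instead be drawn from their posterior distributions.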


