Prediction of protein interdomain linker regions by a hidden Markov model.

Authors

Bae Kyounghwa, Mallick Bani K, Elsik Christine G

Affiliation

Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA.

Publication

Bioinformatics. 2005 May 15;21(10):2264-70. doi: 10.1093/bioinformatics/bti363. Epub 2005 Mar 3.

Abstract

MOTIVATION

Our aim was to predict protein interdomain linker regions using sequence alone, without requiring known homology. Identifying linker regions will delineate domain boundaries, and can be used to computationally dissect proteins into domains prior to clustering them into families. We developed a hidden Markov model of linker/non-linker sequence regions using a linker index derived from amino acid propensity. We employed an efficient Bayesian estimation of the model using Markov Chain Monte Carlo, Gibbs sampling in particular, to simulate parameters from the posteriors. Our model recognizes sequence data to be continuous rather than categorical, and generates a probabilistic output.
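As a rough illustration of how a two-state (linker/non-linker) hidden Markov model can turn a continuous per-residue score into a probabilistic output, the sketch below runs the standard scaled forward-backward recursions. It is not the authors' implementation: the Gaussian emission densities, the function names `gaussian_pdf` and `linker_posterior`, and all parameter values are assumptions made for the example, and the parameters are fixed by hand where the paper estimates them by Gibbs sampling. The input is assumed to be a list of linker index values, one per residue.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution (assumed emission model)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def linker_posterior(scores, means, sds, stay, start=(0.5, 0.5)):
    """Return P(state = linker | all scores) for every position.

    State 0 is non-linker and state 1 is linker. `stay[k]` is the
    probability of remaining in state k from one residue to the next.
    All parameters are hypothetical fixed values, not posterior draws.
    """
    n = len(scores)
    if n == 0:
        return []
    states = (0, 1)
    trans = [[stay[0], 1.0 - stay[0]],
             [1.0 - stay[1], stay[1]]]
    emit = [[gaussian_pdf(x, means[k], sds[k]) for k in states]
            for x in scores]

    # Forward pass, rescaling each step so the values sum to one.
    alpha, scale = [], []
    current = [start[k] * emit[0][k] for k in states]
    for t in range(n):
        if t > 0:
            current = [
                emit[t][k] * sum(alpha[t - 1][j] * trans[j][k] for j in states)
                for k in states
            ]
        c = sum(current)
        alpha.append([v / c for v in current])
        scale.append(c)

    # Backward pass, reusing the forward scaling constants.
    beta = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = [
            sum(trans[k][j] * emit[t + 1][j] * beta[t + 1][j] for j in states)
            / scale[t + 1]
            for k in states
        ]

    # With this scaling, alpha * beta is the posterior state probability.
    return [alpha[t][1] * beta[t][1] for t in range(n)]
```

Calling `linker_posterior(scores, means=(-1.0, 1.0), sds=(0.5, 0.5), stay=(0.9, 0.9))` on a score list yields one probability per residue; thresholding those values (for example at 0.5) would give a hard linker/non-linker call, though the choice of threshold is again an assumption rather than something stated in the abstract.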

RESULTS

We applied our method to a dataset of protein sequences in which domains and interdomain linkers had been delineated using the Pfam-A database. The prediction results are superior to a simpler method that also uses linker index.

