Efficient estimation of emission probabilities in profile hidden Markov models.

Authors

Ahola Virpi, Aittokallio Tero, Uusipaikka Esa, Vihinen Mauno

Affiliation

Department of Statistics, FIN-20014 University of Turku, Finland.

Publication

Bioinformatics. 2003 Dec 12;19(18):2359-68. doi: 10.1093/bioinformatics/btg328.

Abstract

MOTIVATION

Profile hidden Markov models provide a sensitive method for searching sequence databases and aligning multiple sequences. One drawback of the hidden Markov model is that conserved amino acids are not emphasized; signal and noise are treated equally. As a result, the number of estimated emission parameters is often enormous. Focusing the analysis on conserved residues only should increase the accuracy of sequence database searches.

RESULTS

We address this issue with a new method for efficient emission probability (EEP) estimation, in which amino acids are divided into effective and ineffective residues at each conserved alignment position. A practical study with 20 protein families demonstrated that the EEP method can distinguish family members from other proteins with an average sensitivity of 98% and an average specificity of 99%, even when the number of free emission parameters was reduced to 15% of the original. In a database search for TIM barrel sequences, EEP recognized family members nearly as accurately as HMMER or BLAST, while returning significantly fewer false positive sequences than either of those methods.
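The abstract does not say how residues are classified or how the remaining probability is assigned, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a hypothetical frequency threshold (`min_freq`) for calling a residue effective, and it assumes that ineffective residues share the leftover probability mass in proportion to a fixed background distribution. All function and parameter names are illustrative. The sketch also shows how sensitivity and specificity are computed from classification counts.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def emission_probs(column, background, min_freq=0.10):
    """Estimate emission probabilities for one alignment column.

    Assumed rule (not from the paper): a residue is 'effective' if its
    relative frequency in the column is at least min_freq. Effective
    residues keep their observed frequency; ineffective residues split
    the leftover mass in proportion to the background distribution.
    """
    counts = Counter(r for r in column if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column contains no amino acids")

    effective = sorted(r for r, c in counts.items() if c / total >= min_freq)
    probs = {r: counts[r] / total for r in effective}

    ineffective = [r for r in AMINO_ACIDS if r not in probs]
    leftover = 1.0 - sum(probs.values())
    bg_mass = sum(background[r] for r in ineffective)
    for r in ineffective:
        # A real model would likely add pseudocounts to avoid zeros here.
        probs[r] = leftover * background[r] / bg_mass
    return probs, effective


def free_parameters(effective):
    """Free emission parameters for one column under the assumed scheme.

    A full 20-residue distribution has 19 free parameters. Here only the
    effective residues are estimated freely, capped at 19.
    """
    return min(len(effective), len(AMINO_ACIDS) - 1)


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    uniform = {r: 1 / len(AMINO_ACIDS) for r in AMINO_ACIDS}
    toy_column = "AAAAAAGGGS"  # made-up column, not data from the study
    probs, effective = emission_probs(toy_column, uniform, min_freq=0.2)
    print(effective, free_parameters(effective))
    # Toy counts, chosen only to exercise the formulas.
    print(sensitivity_specificity(tp=45, fn=5, tn=90, fp=10))
```

Under these assumptions, a column dominated by a few residues needs only as many free parameters as it has effective residues, instead of 19. This is the kind of reduction the abstract reports in aggregate.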

AVAILABILITY

The algorithms, written in C, are available on request from the authors.
