Meta-MEME: motif-based hidden Markov models of protein families.

Authors

Grundy W N, Bailey T L, Elkan C P, Baker M E

Affiliation

Department of Computer Science and Engineering, University of California, San Diego, La Jolla 92093, USA.

Publication

Comput Appl Biosci. 1997 Aug;13(4):397-406. doi: 10.1093/bioinformatics/13.4.397.

Abstract

MOTIVATION

Modeling families of related biological sequences using hidden Markov models (HMMs), although increasingly widespread, faces at least one major problem: because of the complexity of these mathematical models, they require a relatively large training set in order to accurately recognize a given family. For families in which there are few known sequences, a standard linear HMM contains too many parameters to be trained adequately.

RESULTS

This work attempts to solve that problem by generating smaller HMMs which precisely model only the conserved regions of the family. These HMMs are constructed from motif models generated by the EM algorithm using the MEME software. Because motif-based HMMs have relatively few parameters, they can be trained using smaller data sets. Studies of short-chain alcohol dehydrogenases and 4Fe-4S ferredoxins support the claim that motif-based HMMs exhibit increased sensitivity and selectivity in database searches, especially when training sets contain few sequences.
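
The parameter-count argument can be made concrete with a rough sketch. The Python below is illustrative only: the state layouts, the function names, the family length of 250 and the motif widths are all assumptions chosen for the example and are not taken from the paper or from the Meta-MEME software. It counts free parameters in a simplified linear profile HMM (match, insert and delete states at every position) and in a simplified motif-based HMM (match states only inside motifs, with one self-looping spacer state around each motif).

```python
# Illustrative parameter counting; architectures and numbers are assumptions,
# not the layout used by Meta-MEME or reported in the paper.

ALPHABET_SIZE = 20  # amino acids
FREE_EMISSIONS = ALPHABET_SIZE - 1  # a distribution over 20 symbols has 19 free values


def linear_profile_hmm_params(length: int) -> int:
    """Free parameters of a simplified linear profile HMM.

    Assumed per position: match and insert emission distributions, plus
    three outgoing transitions (2 free values) from each of the match,
    insert and delete states. Boundary effects are ignored.
    """
    emissions = 2 * FREE_EMISSIONS
    transitions = 3 * 2
    return length * (emissions + transitions)


def motif_hmm_params(motif_widths: list[int]) -> int:
    """Free parameters of a simplified motif-based HMM.

    Assumed: one match emission distribution per motif column, and one
    spacer state before, between and after motifs, each with a single
    free self-loop probability and fixed background emissions.
    """
    emissions = FREE_EMISSIONS * sum(motif_widths)
    spacers = len(motif_widths) + 1
    return emissions + spacers


if __name__ == "__main__":
    full = linear_profile_hmm_params(250)    # hypothetical family length
    motif = motif_hmm_params([12, 20, 15])   # hypothetical motif widths
    print(f"linear profile HMM: {full} free parameters")
    print(f"motif-based HMM:    {motif} free parameters")
```

Under these assumptions the linear model has 11000 free parameters and the motif-based model 897, which shows the scale of the reduction that lets a small training set constrain the model more tightly.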
