Meta-MEME: motif-based hidden Markov models of protein families.

Authors

Grundy W N, Bailey T L, Elkan C P, Baker M E

Affiliation

Department of Computer Science and Engineering, University of California, San Diego, La Jolla 92093, USA.

Publication details

Comput Appl Biosci. 1997 Aug;13(4):397-406. doi: 10.1093/bioinformatics/13.4.397.

DOI: 10.1093/bioinformatics/13.4.397
PMID: 9283754
Abstract

MOTIVATION

Modeling families of related biological sequences using Hidden Markov models (HMMs), although increasingly widespread, faces at least one major problem: because of the complexity of these mathematical models, they require a relatively large training set in order to accurately recognize a given family. For families in which there are few known sequences, a standard linear HMM contains too many parameters to be trained adequately.

RESULTS

This work attempts to solve that problem by generating smaller HMMs which precisely model only the conserved regions of the family. These HMMs are constructed from motif models generated by the EM algorithm using the MEME software. Because motif-based HMMs have relatively few parameters, they can be trained using smaller data sets. Studies of short chain alcohol dehydrogenases and 4Fe-4S ferredoxins support the claim that motif-based HMMs exhibit increased sensitivity and selectivity in database searches, especially when training sets contain few sequences.
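The parameter argument in the abstract can be made concrete with a rough count. The sketch below is illustrative only: the state layout (match, insert, and delete states per column for the linear model; match states plus one self-looping spacer state around each motif for the motif-based model), the sequence length, and the motif widths are all assumptions chosen for the example, not figures from the paper.

```python
ALPHABET_SIZE = 20  # amino acids


def linear_hmm_params(length: int) -> int:
    """Free parameters of an assumed profile-style linear HMM.

    Assumed layout: every column has a match, an insert, and a delete state.
    Match and insert states each emit over 20 amino acids (19 free values);
    each of the three states has 3 outgoing transitions (2 free values).
    """
    emissions = 2 * length * (ALPHABET_SIZE - 1)
    transitions = length * 3 * 2
    return emissions + transitions


def motif_hmm_params(motif_widths: list[int]) -> int:
    """Free parameters of an assumed motif-based HMM.

    Assumed layout: one match state per motif column, plus a spacer state
    before, between, and after the motifs. Spacer emissions are fixed to a
    background distribution, so each spacer adds one free self-loop value.
    """
    match_emissions = sum(motif_widths) * (ALPHABET_SIZE - 1)
    spacers = len(motif_widths) + 1
    return match_emissions + spacers


if __name__ == "__main__":
    # Hypothetical family: about 250 residues long, three conserved motifs.
    full = linear_hmm_params(250)
    small = motif_hmm_params([12, 15, 10])
    print(f"linear HMM: {full} free parameters")
    print(f"motif-based HMM: {small} free parameters")
    print(f"reduction factor: {full / small:.1f}x")
```

Under these assumptions the motif-based model has more than an order of magnitude fewer parameters to estimate, which is the sense in which it can be trained on fewer sequences.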


Similar articles

1
Meta-MEME: motif-based hidden Markov models of protein families.
Comput Appl Biosci. 1997 Aug;13(4):397-406. doi: 10.1093/bioinformatics/13.4.397.
2
Hidden Markov models for sequence analysis: extension and analysis of the basic method.
Comput Appl Biosci. 1996 Apr;12(2):95-107. doi: 10.1093/bioinformatics/12.2.95.
3
The effects of ordered-series-of-motifs anchoring and sub-class modeling on the generation of HMMs representing highly divergent protein sequences.
Pac Symp Biocomput. 1999:162-70. doi: 10.1142/9789814447300_0016.
4
Simultaneous sequence alignment and tree construction using hidden Markov models.
Pac Symp Biocomput. 2003:180-91.
5
Hidden Markov model analysis of motifs in steroid dehydrogenases and their homologs.
Biochem Biophys Res Commun. 1997 Feb 24;231(3):760-6. doi: 10.1006/bbrc.1997.6193.
6
Hidden Markov models of biological primary sequence information.
Proc Natl Acad Sci U S A. 1994 Feb 1;91(3):1059-63. doi: 10.1073/pnas.91.3.1059.
7
Hidden Markov models in computational biology. Applications to protein modeling.
J Mol Biol. 1994 Feb 4;235(5):1501-31. doi: 10.1006/jmbi.1994.1104.
8
Using Dirichlet mixture priors to derive hidden Markov models for protein families.
Proc Int Conf Intell Syst Mol Biol. 1993;1:47-55.
9
Substitution matrices and hidden Markov models.
J Comput Biol. 1995 Fall;2(3):487-91. doi: 10.1089/cmb.1995.2.487.
10
Homology detection via family pairwise search.
J Comput Biol. 1998 Fall;5(3):479-91. doi: 10.1089/cmb.1998.5.479.

Cited by

1
Genome-wide identification and expression analyses of phenylalanine ammonia-lyase gene family members from tomato () reveal their role in root-knot nematode infection.
Front Plant Sci. 2023 Jun 6;14:1204990. doi: 10.3389/fpls.2023.1204990. eCollection 2023.
2
Genome-wide identification and analysis of the evolution and expression pattern of the gene family in three wild species of tomatoes.
PeerJ. 2023 Feb 13;11:e14844. doi: 10.7717/peerj.14844. eCollection 2023.
3
Pneumococcal capsule expression is controlled through a conserved, distal cis-regulatory element during infection.
PLoS Pathog. 2023 Jan 31;19(1):e1011035. doi: 10.1371/journal.ppat.1011035. eCollection 2023 Jan.
4
Analysis of Protein Sequence Identity, Binding Sites, and 3D Structures Identifies Eight Pollen Species and Ten Fruit Species with High Risk of Cross-Reactive Allergies.
Genes (Basel). 2022 Aug 17;13(8):1464. doi: 10.3390/genes13081464.
5
Genome-Wide Identification and Evolutionary Analysis of the SRO Gene Family in Tomato.
Front Genet. 2021 Sep 21;12:753638. doi: 10.3389/fgene.2021.753638. eCollection 2021.
6
Predicting transcription factor binding sites using DNA shape features based on shared hybrid deep learning architecture.
Mol Ther Nucleic Acids. 2021 Feb 18;24:154-163. doi: 10.1016/j.omtn.2021.02.014. eCollection 2021 Jun 4.
7
An effective approach for annotation of protein families with low sequence similarity and conserved motifs: identifying GDSL hydrolases across the plant kingdom.
BMC Bioinformatics. 2016 Feb 18;17:91. doi: 10.1186/s12859-016-0919-7.
8
Horizontal functional gene transfer from bacteria to fishes.
Sci Rep. 2015 Dec 22;5:18676. doi: 10.1038/srep18676.
9
Biodefense Oriented Genomic-Based Pathogen Classification Systems: Challenges and Opportunities.
J Bioterror Biodef. 2012 Mar 16;3(1):1000113. doi: 10.4172/2157-2526.1000113.
10
In vitro selection of RNA aptamers directed against protein E: a Haemophilus influenzae adhesin.
Mol Biotechnol. 2014 Aug;56(8):714-25. doi: 10.1007/s12033-014-9749-x.