
Homology detection via family pairwise search.

Author information

Grundy W N

Affiliation

Department of Computer Science and Engineering, University of California, San Diego, La Jolla 92093-0114, USA.

Publication information

J Comput Biol. 1998 Fall;5(3):479-91. doi: 10.1089/cmb.1998.5.479.

DOI: 10.1089/cmb.1998.5.479
PMID: 9773344
Abstract

The function of an unknown biological sequence can often be accurately inferred by identifying sequences homologous to the original sequence. Given a query set of known homologs, there exist at least three general classes of techniques for finding additional homologs: pairwise sequence comparisons, motif analysis, and hidden Markov modeling. Pairwise sequence comparisons are typically employed when only a single query sequence is known. Hidden Markov models (HMMs), on the other hand, are usually trained with sets of more than 100 sequences. Motif-based methods fall in between these two extremes. The current work introduces a straightforward generalization of pairwise sequence comparison algorithms to the case when multiple query sequences are available. This algorithm, called Family Pairwise Search (FPS), combines pairwise sequence comparison scores from each query sequence. A BLAST implementation of FPS is compared to representative examples of hidden Markov modeling (HMMER) and motif modeling (MEME). The three techniques are compared across a wide range of protein families, using query sets of varying sizes. BLAST FPS significantly outperforms motif-based and HMM methods. Furthermore, FPS is much more efficient than the training algorithms for statistical models.
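The core idea of FPS, combining the pairwise comparison scores that each query sequence obtains against a database sequence, can be sketched in a few lines. This is an illustrative sketch only, not the author's implementation: the abstract does not say how the scores are combined, so the `combine` parameter (defaulting to the mean) is an assumption, and the toy `local_alignment_score` function is a hypothetical stand-in for BLAST. All function and parameter names here are invented for illustration.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Toy Smith-Waterman local alignment score with a linear gap penalty.

    Assumption: this stands in for a real pairwise tool such as BLAST.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def family_pairwise_search(
    queries: Sequence[str],
    database: Dict[str, str],
    combine: Callable[[List[int]], float] = mean,  # assumed combination rule
    pairwise: Callable[[str, str], int] = local_alignment_score,
) -> List[Tuple[str, float]]:
    """Rank database sequences by their combined score against all queries."""
    scored = {
        name: combine([pairwise(q, seq) for q in queries])
        for name, seq in database.items()
    }
    # Highest combined score first: the most likely homologs lead the list.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `family_pairwise_search(["ACDEFGHIK", "ACDEYGHIK"], {"homolog": "MMACDEFGHIKLL", "unrelated": "WWWWPPPPWWWW"})` ranks `homolog` first. Passing `combine=max` instead of the default swaps in a different combination rule without touching the pairwise scorer, and no model training step is involved, which is consistent with the efficiency claim in the abstract.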

