Homology detection via family pairwise search.

Author information

Grundy W N

Affiliation

Department of Computer Science and Engineering, University of California, San Diego, La Jolla 92093-0114, USA.

Publication information

J Comput Biol. 1998 Fall;5(3):479-91. doi: 10.1089/cmb.1998.5.479.

Abstract

The function of an unknown biological sequence can often be accurately inferred by identifying sequences homologous to the original sequence. Given a query set of known homologs, there exist at least three general classes of techniques for finding additional homologs: pairwise sequence comparisons, motif analysis, and hidden Markov modeling. Pairwise sequence comparisons are typically employed when only a single query sequence is known. Hidden Markov models (HMMs), on the other hand, are usually trained with sets of more than 100 sequences. Motif-based methods fall in between these two extremes. The current work introduces a straightforward generalization of pairwise sequence comparison algorithms to the case when multiple query sequences are available. This algorithm, called Family Pairwise Search (FPS), combines pairwise sequence comparison scores from each query sequence. A BLAST implementation of FPS is compared to representative examples of hidden Markov modeling (HMMER) and motif modeling (MEME). The three techniques are compared across a wide range of protein families, using query sets of varying sizes. BLAST FPS significantly outperforms motif-based and HMM methods. Furthermore, FPS is much more efficient than the training algorithms for statistical models.
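
The core idea in the abstract, scoring a candidate sequence against every query sequence and then combining those pairwise scores, can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `local_alignment_score` is a toy Smith-Waterman scorer standing in for BLAST, `fps_score` and `fps_rank` are hypothetical helper names, and the combining rule (mean by default, with max as an alternative) is a placeholder, since the abstract does not state which rule the paper uses.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple


def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Toy Smith-Waterman local alignment score with a linear gap penalty.

    This is only a stand-in for a real pairwise comparison tool such as BLAST.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def fps_score(queries: Sequence[str], target: str,
              combine: Callable[[List[int]], float] = mean) -> float:
    """Combine the pairwise scores of one target against every query.

    The combining rule is an assumption; pass `max`, `mean`, or any other
    function that reduces a list of scores to a single number.
    """
    return combine([local_alignment_score(q, target) for q in queries])


def fps_rank(queries: Sequence[str], database: Dict[str, str],
             combine: Callable[[List[int]], float] = mean
             ) -> List[Tuple[str, float]]:
    """Rank database sequences by combined score, best first."""
    scored = [(name, fps_score(queries, seq, combine))
              for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    query_set = ["ACGT", "ACGA"]                         # hypothetical query family
    database = {"hit": "TTACGTTT", "miss": "GGGGGG"}     # hypothetical targets
    for name, score in fps_rank(query_set, database):
        print(name, score)
```

Because no model is trained, the only cost in this sketch is one pairwise comparison per query-target pair, which is consistent with the abstract's point that FPS avoids the training step required by statistical models.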
