NBLAST: a cluster variant of BLAST for NxN comparisons.

Author information

Dumontier Michel, Hogue Christopher W V

Affiliation

Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8.

Publication information

BMC Bioinformatics. 2002 May 8;3:13. doi: 10.1186/1471-2105-3-13.

Abstract

BACKGROUND

The BLAST algorithm compares biological sequences to one another in order to determine shared motifs and common ancestry. However, the comparison of all non-redundant (NR) sequences against all other NR sequences is a computationally intensive task. We developed NBLAST as a cluster computer implementation of the BLAST family of sequence comparison programs for the purpose of generating pre-computed BLAST alignments and neighbour lists of NR sequences.

RESULTS

NBLAST performs the heuristic BLAST algorithm and generates an exhaustive database of alignments, but it computes only the upper triangle of the possible N² pairwise alignments, where N is the number of sequences to be compared. A task-partitioning algorithm distributes the work across all cluster nodes, and the NBLAST master process produces a BLAST sequence alignment database and a list of sequence neighbours for each sequence record. The resulting sequence alignment and neighbour databases serve the SeqHound query system through a C/C++ and Perl Application Programming Interface (API).
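
The abstract does not describe the task-partitioning algorithm itself, so the following is only an illustrative sketch of how upper-triangle work could be split across workers. It assumes the strict upper triangle (no self-comparisons), giving N(N-1)/2 pairs, and balances contiguous blocks of query rows by pair count. The function names `upper_triangle_pairs` and `partition_rows` are hypothetical and are not part of NBLAST.

```python
from typing import Iterator, List, Tuple


def upper_triangle_pairs(n: int) -> Iterator[Tuple[int, int]]:
    """Yield every index pair (i, j) with i < j (strict upper triangle)."""
    for i in range(n):
        for j in range(i + 1, n):
            yield i, j


def partition_rows(n: int, workers: int) -> List[range]:
    """Split query rows 0..n-1 into contiguous blocks with similar pair counts.

    Row i is compared against rows i+1..n-1, so it contributes n - 1 - i
    pairs. Early rows are therefore more expensive than late ones, and
    equal-sized row blocks would be unbalanced. This greedy walk closes a
    block each time the running pair count crosses the next equal share.
    (Assumed scheme for illustration, not the published NBLAST algorithm.)
    """
    total = n * (n - 1) // 2
    share = total / workers
    blocks: List[range] = []
    start = 0
    running = 0
    for i in range(n):
        running += n - 1 - i
        if len(blocks) < workers - 1 and running >= share * (len(blocks) + 1):
            blocks.append(range(start, i + 1))
            start = i + 1
    blocks.append(range(start, n))  # the last block takes whatever remains
    return blocks


if __name__ == "__main__":
    for block in partition_rows(6, 3):
        pairs = sum(6 - 1 - i for i in block)
        print(list(block), "->", pairs, "pairs")
```

Because the cost per row shrinks linearly, later blocks hold more rows than earlier ones; a production scheduler would likely also weight rows by sequence length, which this sketch ignores.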

CONCLUSIONS

NBLAST offers a local alternative to the NCBI's remote Entrez system for pre-computed BLAST alignments and neighbour queries. On our 216-processor 450 MHz PIII cluster, NBLAST requires about 24 hours to compute neighbours for the 850,000 proteins currently in the non-redundant protein database.
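
To give a sense of scale, the figures quoted above can be turned into a rough back-of-envelope estimate. This sketch assumes a strict upper triangle of N(N-1)/2 pairs and treats the approximate 24-hour run as exact. BLAST is a heuristic that does not evaluate every pair independently, so the resulting rate is only an effective figure, not a measured per-alignment speed.

```python
# Figures taken from the abstract; the 24 h wall time is approximate there.
N_PROTEINS = 850_000
PROCESSORS = 216
WALL_HOURS = 24

all_pairs = N_PROTEINS ** 2                        # full N x N matrix
upper_pairs = N_PROTEINS * (N_PROTEINS - 1) // 2   # strict upper triangle (assumed)
cpu_seconds = PROCESSORS * WALL_HOURS * 3600       # total processor-seconds
effective_rate = upper_pairs / cpu_seconds         # pairs per processor-second

print(f"full matrix:     {all_pairs:,} pairs")
print(f"upper triangle:  {upper_pairs:,} pairs")
print(f"processor time:  {cpu_seconds:,} s")
print(f"effective rate:  {effective_rate:,.0f} pairs per processor-second")
```

Restricting the work to the upper triangle roughly halves the number of pairs compared with the full N² matrix.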

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db42/113272/72bf2c91c3bb/1471-2105-3-13-1.jpg