Parallel computation and FASTA: confronting the problem of parallel database search for a fast sequence comparison algorithm.

Authors

Miller P L, Nadkarni P M, Carriero N M

Affiliation

Department of Anesthesiology, Yale University School of Medicine, New Haven, CT 06510.

Publication information

Comput Appl Biosci. 1991 Jan;7(1):71-8. doi: 10.1093/bioinformatics/7.1.71.

DOI:10.1093/bioinformatics/7.1.71

PMID:2004277

Abstract

We have parallelized the FASTA algorithm for biological sequence comparison using Linda, a machine-independent parallel programming language. The resulting parallel program runs on a variety of parallel machines. A straightforward parallelization strategy works well if the amount of computation to be done is relatively large. When the amount of computation is reduced, however, disk I/O becomes a bottleneck that may prevent additional speed-up as the number of processors is increased. The paper describes the parallelization of FASTA and uses it to illustrate the I/O bottleneck problem that may arise when performing parallel database search with a fast sequence comparison algorithm. It also describes several program design strategies that can help with this problem, and discusses how the bottleneck is an example of a general problem that may occur when parallelizing, or otherwise speeding up, a time-consuming computation.
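The bottleneck described in the abstract can be illustrated with a rough speed-up model. The sketch below is not taken from the paper: it assumes that reading the database from disk is entirely serial while the comparison work divides evenly among workers, and the function names and timings are hypothetical, chosen only for illustration.

```python
def estimated_speedup(compute_seconds, io_seconds, workers):
    """Estimate speed-up under an assumed model (not the paper's).

    Assumption: disk I/O is serial and takes io_seconds regardless of
    the number of workers; comparison work divides evenly among workers.
    """
    serial_time = io_seconds + compute_seconds
    parallel_time = io_seconds + compute_seconds / workers
    return serial_time / parallel_time


def speedup_ceiling(compute_seconds, io_seconds):
    """Upper bound on speed-up in this model as workers grows without limit."""
    return (io_seconds + compute_seconds) / io_seconds


# Hypothetical timings: same I/O cost, different amounts of computation.
scenarios = [("compute-heavy", 1000.0, 10.0), ("compute-light", 50.0, 10.0)]
for label, compute, io in scenarios:
    for workers in (1, 4, 16, 64):
        s = estimated_speedup(compute, io, workers)
        print(f"{label:14s} workers={workers:3d} speed-up={s:6.2f}")
    print(f"{label:14s} ceiling={speedup_ceiling(compute, io):6.2f}")
```

In this model the compute-light case stops gaining from extra workers much sooner, because the fixed I/O time dominates once the per-worker computation becomes small. That is the qualitative behavior the abstract attributes to a fast comparison algorithm.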
