Lemoine E, Quinqueton J, Sallantin J
Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier UMR 9928 Université Montpellier II/CNRS, France.
Proc Int Conf Intell Syst Mol Biol. 1994;2:269-75.
Homology detection in large databases is probably the most time-consuming operation in molecular genetic computing systems. Moreover, worldwide progress in mapping and sequencing the genomes of Homo sapiens and other species has increased the size of these databases exponentially. As a result, even the best workstation cannot reach the required scanning speed. To answer this need we propose an algorithm, A2R2, and its implementation on a massively parallel system. Two kinds of algorithms are used to search molecular genetic databases: the first is based on dynamic programming and the second on word processing. A2R2 belongs to the second kind. The structure of the motif (pattern) searched by A2R2 can support the motifs used by the FAST, BLAST and FLASH algorithms. After a short presentation of the reconfigurable hardware concept and technology used in our massively parallel accelerator, we present the A2R2 implementation. This parallel implementation outperforms all previously published genetic database scanning hardware and algorithms. Our best result is a scanning rate of up to 25 million nucleotides per second.
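As an illustration of the second, word-based kind of search mentioned in the abstract, the following is a minimal sketch of exact word (k-tuple) matching between a query and a database sequence. It is not A2R2 itself: the abstract does not specify the A2R2 motif structure or its hardware mapping, and the function names `build_word_index` and `scan_database`, as well as the word length `k`, are assumptions made only for this example.

```python
from collections import defaultdict


def build_word_index(query, k):
    """Map each length-k word of the query to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    return index


def scan_database(database, query, k=4):
    """Return (database_pos, query_pos) pairs where a k-letter word matches exactly.

    Hypothetical helper for illustration only; real tools extend or score
    such word hits in further steps that are not shown here.
    """
    index = build_word_index(query, k)
    hits = []
    # Slide a window of length k over the database and look each word up.
    for j in range(len(database) - k + 1):
        for i in index.get(database[j:j + k], ()):
            hits.append((j, i))
    return hits


if __name__ == "__main__":
    print(scan_database("TTACGTAA", "ACGT", k=4))
```

Each database position is examined independently of the others, which is the property that makes word-based scanning a natural candidate for parallel hardware.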