Bishop M, Thompson E
Nucleic Acids Res. 1984 Jul 11;12(13):5471-4. doi: 10.1093/nar/12.13.5471.
An extremely fast method of searching a nucleic acid sequence database with a probe sequence is described. The method considers sub-sequences that are unique within a database sequence and shared between that sequence and the probe, and detects two signals among them: a deviation from their expected number, and a deviation from a random spatial distribution. On an IBM 3081 computer, a total search of an encoded form of the EMBL nucleic acid sequence database with a 1 kbase probe sequence is completed in a few seconds. The best previous methods for a similar task required a few minutes.
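The two signals named in the abstract (an excess over the expected number of shared unique sub-sequences, and their non-random clustering along the sequence) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no word length, encoding, or statistic, so the fixed word length `k`, the uniform random-base model in `expected_shared`, the sliding-window clustering measure, and all function names are assumptions made for illustration. The expected count also ignores the uniqueness condition, so it is only a rough baseline.

```python
def kmer_positions(seq, k):
    """Map each length-k word to the list of its start positions in seq."""
    positions = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    return positions


def unique_shared_positions(target, probe, k):
    """Sorted start positions in target of words that occur exactly once
    in target and at least once in probe."""
    probe_words = {probe[i:i + k] for i in range(len(probe) - k + 1)}
    return sorted(
        pos[0]
        for word, pos in kmer_positions(target, k).items()
        if len(pos) == 1 and word in probe_words
    )


def expected_shared(target_len, probe_len, k, alphabet=4):
    """Rough expected number of target words also present in the probe,
    assuming independent, equally likely bases (an assumption, not the
    paper's model)."""
    n_target = max(target_len - k + 1, 0)
    n_probe = max(probe_len - k + 1, 0)
    p_in_probe = 1 - (1 - alphabet ** -k) ** n_probe
    return n_target * p_in_probe


def max_window_count(positions, window):
    """Largest number of sorted positions falling within any span shorter
    than `window`; a crude measure of spatial clustering."""
    best = 0
    left = 0
    for right, pos in enumerate(positions):
        while pos - positions[left] >= window:
            left += 1
        best = max(best, right - left + 1)
    return best


def score(target, probe, k=6, window=50):
    """Summarise both signals for one database sequence."""
    hits = unique_shared_positions(target, probe, k)
    expected = expected_shared(len(target), len(probe), k)
    return {
        "observed": len(hits),
        "expected": expected,
        "excess": len(hits) - expected,
        "max_cluster": max_window_count(hits, window),
    }


if __name__ == "__main__":
    probe = "GATTACAGGC"
    related = "TTTTTTTT" + probe + "TTTTTTTT"
    unrelated = "C" * 26
    print(score(related, probe, k=4, window=10))
    print(score(unrelated, probe, k=4, window=10))
```

In this toy run, the sequence that embeds the probe yields many more shared unique words than the random baseline predicts, and they fall in one tight cluster, while the unrelated sequence yields none. A real search would rank every database entry by some combination of these two quantities; how the original method combines them is not stated in the abstract.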