Rapid similarity searches of nucleic acid and protein data banks.

Author information

Wilbur W J, Lipman D J

Publication information

Proc Natl Acad Sci U S A. 1983 Feb;80(3):726-30. doi: 10.1073/pnas.80.3.726.

Abstract

With the development of large data banks of protein and nucleic acid sequences, the need for efficient methods of searching such banks for sequences similar to a given sequence has become evident. We present an algorithm for the global comparison of sequences based on matching k-tuples of sequence elements for a fixed k. The method results in substantial reduction in the time required to search a data bank when compared with prior techniques of similarity analysis, with minimal loss in sensitivity. The algorithm has also been adapted, in a separate implementation, to produce rigorous sequence alignments. Currently, using the DEC KL-10 system, we can compare all sequences in the entire Protein Data Bank of the National Biomedical Research Foundation with a 350-residue query sequence in less than 3 min and carry out a similar analysis with a 500-base query sequence against all eukaryotic sequences in the Los Alamos Nucleic Acid Data Base in less than 2 min.
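The abstract names the core idea, matching k-tuples of sequence elements for a fixed k, without giving details. The following is a minimal Python sketch of one way such matching can work, not the authors' implementation: it indexes the query's k-tuples in a lookup table, scans a data bank sequence once, and counts matches per diagonal (target position minus query position). The function names and the use of diagonal counts as a similarity signal are assumptions made for illustration.

```python
from collections import defaultdict


def ktuple_index(seq, k):
    """Map each k-tuple in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def diagonal_scores(query, target, k):
    """Count k-tuple matches on each diagonal (target offset minus query offset)."""
    index = ktuple_index(query, k)
    counts = defaultdict(int)
    for j in range(len(target) - k + 1):
        # Look up the target k-tuple; .get avoids adding empty entries.
        for i in index.get(target[j:j + k], ()):
            counts[j - i] += 1
    return dict(counts)


def best_diagonal(query, target, k):
    """Return (diagonal, match_count) for the richest diagonal, or None."""
    counts = diagonal_scores(query, target, k)
    if not counts:
        return None
    # Prefer the highest count; break ties toward the diagonal nearest zero.
    return max(counts.items(), key=lambda item: (item[1], -abs(item[0])))
```

For example, `best_diagonal("ACGTACGT", "TTACGTACGTTT", 4)` returns `(2, 5)`: the query lines up with the target at an offset of two positions, where five 4-tuples match. Because the table lookup replaces a residue-by-residue comparison, the work per data bank sequence grows roughly with its length plus the number of k-tuple hits, which is in keeping with the speedup the abstract reports.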
