A rapid access motif database (RAMdb) with a search algorithm for the retrieval patterns in nucleic acids or protein databanks.

Authors

Fondrat C, Dessen P

Affiliation

CIT12 (Centre Interuniversitaire de Traitement de l'Information), Universite Paris, France.

Publication information

Comput Appl Biosci. 1995 Jun;11(3):273-9. doi: 10.1093/bioinformatics/11.3.273.

PMID:7583695

Abstract

We present here a codification structure, entirely interfaced with the main packages for biomolecule database management, associated with a new search algorithm to retrieve quickly a sequence in a database. This system is derived from a method previously proposed for homology search in databanks with a preprocessed codification of an entire database in which all the overlapping subsequences of a specific length in a sequence were converted into a code and stored in a hash-coding file. This new algorithm is designed for an improved use of the codification. It is based on the recognition of the rarest strings which characterize the query sequence and the intersection of sorted lists read in the codification structure. The system is applicable to both nucleic acid and protein sequences and is used to find patterns in databanks or large sets of sequences. A few examples of applications are given. In addition, the comparison of our method with existing ones shows that this new approach speeds up the search for query patterns in large data sets.
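The two ideas in the abstract, a precomputed index of all overlapping fixed-length words and a search that starts from the rarest words of the query and intersects their occurrence lists, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word length `K`, the function names `build_index` and `search`, the `n_rare` parameter, and the in-memory dictionary (standing in for the hash-coding file) are all assumptions made for illustration, and Python sets are used for the intersection instead of a merge over sorted lists.

```python
from collections import defaultdict

K = 3  # hypothetical word length; the abstract does not state the value used


def build_index(sequences, k=K):
    """Map every overlapping k-length word to a list of (seq_id, position)."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    # Entries are appended in increasing (seq_id, pos) order,
    # so each occurrence list is already sorted.
    return index


def search(index, sequences, query, k=K, n_rare=2):
    """Return sorted (seq_id, start) pairs where query occurs exactly."""
    if len(query) < k:
        raise ValueError("query must be at least k characters long")
    # Every overlapping word of the query, with its offset inside the query.
    words = [(query[i:i + k], i) for i in range(len(query) - k + 1)]
    # Rarest words first: they have the shortest occurrence lists.
    words.sort(key=lambda w: len(index.get(w[0], ())))
    candidates = None
    for word, offset in words[:n_rare]:
        # Shift each occurrence back to the implied start of the query.
        starts = {(sid, pos - offset) for sid, pos in index.get(word, ())}
        candidates = starts if candidates is None else candidates & starts
        if not candidates:
            return []
    # Confirm the surviving candidates against the raw sequences.
    return sorted(
        (sid, start) for sid, start in candidates
        if start >= 0 and sequences[sid][start:start + len(query)] == query
    )
```

For example, with `sequences = ["ACGTACGTGA", "TTGACGTGAC", "GGGGGGGG"]` and `index = build_index(sequences)`, the call `search(index, sequences, "ACGTGA")` returns `[(0, 4), (1, 3)]`. Starting from the rarest words keeps the candidate set small before the final verification step, which is the intuition behind the speed-up the abstract reports.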

Similar articles

A rapid access motif database (RAMdb) with a search algorithm for the retrieval patterns in nucleic acids or protein databanks.

Comput Appl Biosci. 1995 Jun;11(3):273-9. doi: 10.1093/bioinformatics/11.3.273.

Principle of codification for quick comparisons with the entire biomolecule databanks and associated programs in FORTRAN 77.

Nucleic Acids Res. 1986 Jan 10;14(1):197-204. doi: 10.1093/nar/14.1.197.

A novel sequence similarity searching and visualization method based on overlappingly translated nucleic acids: the blastNP.

Med Hypotheses. 2004;62(4):568-74. doi: 10.1016/j.mehy.2003.11.020.

Comput Biol Chem. 2010 Apr;34(2):131-6. doi: 10.1016/j.compbiolchem.2010.03.007. Epub 2010 Apr 4.

LASSAP, a LArge Scale Sequence compArison Package.

Comput Appl Biosci. 1997 Apr;13(2):137-43. doi: 10.1093/bioinformatics/13.2.137.

Proc Natl Acad Sci U S A. 1983 Feb;80(3):726-30. doi: 10.1073/pnas.80.3.726.

The BioPrompt-box: an ontology-based clustering tool for searching in biological databases.

BMC Bioinformatics. 2007 Mar 8;8 Suppl 1(Suppl 1):S8. doi: 10.1186/1471-2105-8-S1-S8.

A Firefly Algorithm-based Approach for Pseudo-Relevance Feedback: Application to Medical Database.

J Med Syst. 2016 Nov;40(11):240. doi: 10.1007/s10916-016-0603-5. Epub 2016 Sep 27.

Data bank homology search algorithm with linear computation complexity.

Comput Appl Biosci. 1994 Jun;10(3):319-22. doi: 10.1093/bioinformatics/10.3.319.

Filtering redundancies for sequence similarity search programs.

J Biomol Struct Dyn. 2005 Feb;22(4):487-92. doi: 10.1080/07391102.2005.10507020.
