A rapid access motif database (RAMdb) with a search algorithm for the retrieval patterns in nucleic acids or protein databanks.

Author information

Fondrat C, Dessen P

Affiliations

CITI2 (Centre Interuniversitaire de Traitement de l'Information), Université Paris, France.

Publication information

Comput Appl Biosci. 1995 Jun;11(3):273-9. doi: 10.1093/bioinformatics/11.3.273.

Abstract

We present here a codification structure, entirely interfaced with the main packages for biomolecule database management, associated with a new search algorithm to retrieve quickly a sequence in a database. This system is derived from a method previously proposed for homology search in databanks with a preprocessed codification of an entire database in which all the overlapping subsequences of a specific length in a sequence were converted into a code and stored in a hash-coding file. This new algorithm is designed for an improved use of the codification. It is based on the recognition of the rarest strings which characterize the query sequence and the intersection of sorted lists read in the codification structure. The system is applicable to both nucleic acid and protein sequences and is used to find patterns in databanks or large sets of sequences. A few examples of applications are given. In addition, the comparison of our method with existing ones shows that this new approach speeds up the search for query patterns in large data sets.
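The general idea described in the abstract (index every overlapping fixed-length subsequence, pick the rarest words of the query, and intersect their sorted lists) can be illustrated with a minimal sketch. This is not the authors' RAMdb implementation: the function names, the in-memory dictionary standing in for the hash-coding file, the `n_rarest` parameter, and the final exact-match verification step are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(sequences, k):
    """Map each overlapping k-length word to a sorted list of sequence ids.

    Assumption: an in-memory dict stands in for the on-disk hash-coding file.
    """
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        seen = set()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if word not in seen:
                seen.add(word)
                # ids are appended in increasing order, so each list stays sorted
                index[word].append(seq_id)
    return index


def intersect_sorted(a, b):
    """Merge-style intersection of two sorted lists of ids."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search(index, sequences, query, k, n_rarest=3):
    """Return ids of sequences that contain `query` as an exact substring.

    Hypothetical strategy: intersect the lists of the `n_rarest` least
    frequent query words, then verify the surviving candidates directly.
    """
    if len(query) < k:
        raise ValueError("query must be at least k characters long")
    words = {query[i:i + k] for i in range(len(query) - k + 1)}
    # Rarest words first: they have the shortest lists and prune fastest.
    ranked = sorted(words, key=lambda w: len(index.get(w, ())))
    candidates = None
    for word in ranked[:n_rarest]:
        posting = index.get(word, [])
        if candidates is None:
            candidates = posting
        else:
            candidates = intersect_sorted(candidates, posting)
        if not candidates:
            return []
    # Sharing a few words does not guarantee a match, so confirm each candidate.
    return [sid for sid in candidates if query in sequences[sid]]


if __name__ == "__main__":
    db = ["ACGTACGT", "TTTTACGA", "GGGACGTT", "CCCCCCCC"]
    idx = build_index(db, k=3)
    print(search(idx, db, "ACGT", k=3))  # prints [0, 2]
```

Starting from the rarest words keeps the intermediate candidate lists short, which is the property the abstract credits for the speed-up; the word length `k` and the number of words intersected are tuning choices that this sketch does not take from the paper.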
