Rapid protein fragment search using hash functions based on the Fourier transform.

Authors

Akutsu T, Onizuka K, Ishikawa M

Affiliation

Human Genome Center, University of Tokyo, Japan.

Publication

Comput Appl Biosci. 1997 Aug;13(4):357-64. doi: 10.1093/bioinformatics/13.4.357.

Abstract

MOTIVATION

Since the protein structure database has been growing very rapidly in recent years, the development of efficient methods for searching for similar structures is very important.

RESULTS

This paper presents a novel method for searching for similar fragments of proteins. In this method, a hash vector (a vector of real numbers) is associated with each fixed-length fragment of three-dimensional protein structure. Each vector consists of low-frequency components of the Fourier-like spectrum for the distances between Cα atoms and the centroid. Then, we can analyze the similarity between fragments by evaluating the difference between hash vectors. The novel aspect of the method is that the following property is proved theoretically: if the root mean square distance between two fragments is small, then the distance between the hash vectors is small. Several variants of this method were compared with a naive method and a previous method using PDB data. The results show that the fastest one among the variants is 18-80 times faster than the naive method, and 3-10 times faster than the previous method.
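
The hashing idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 1/sqrt(n) scaling, the choice of k = 3 coefficients, and the toy coordinates are all assumptions made for illustration, and a plain discrete Fourier transform stands in for the paper's "Fourier-like" spectrum.

```python
import cmath
import math


def euclid(u, v):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def centroid_distances(coords):
    """Distance from each C-alpha position to the fragment centroid."""
    n = len(coords)
    centroid = tuple(sum(p[axis] for p in coords) / n for axis in range(3))
    return [euclid(p, centroid) for p in coords]


def hash_vector(coords, k=3):
    """Hypothetical hash: the k lowest-frequency DFT coefficients of the
    centroid-distance profile, flattened to 2*k real numbers.
    The 1/sqrt(n) scaling is an assumption of this sketch."""
    d = centroid_distances(coords)
    n = len(d)
    vec = []
    for f in range(k):
        coeff = sum(
            d[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n)
        ) / math.sqrt(n)
        vec.extend((coeff.real, coeff.imag))
    return vec


def hash_distance(u, v):
    """Distance between two hash vectors."""
    return euclid(u, v)


# Toy fragments (made-up coordinates, not PDB data).
helix = [
    (2.3 * math.cos(math.radians(100 * i)),
     2.3 * math.sin(math.radians(100 * i)),
     1.5 * i)
    for i in range(8)
]
angle = 0.7  # rotate the helix about the z axis, then translate it
moved = [
    (x * math.cos(angle) - y * math.sin(angle) + 5.0,
     x * math.sin(angle) + y * math.cos(angle) - 3.0,
     z + 2.0)
    for x, y, z in helix
]
strand = [(0.0, 0.0, 3.8 * i) for i in range(8)]

print("helix vs moved helix:", hash_distance(hash_vector(helix), hash_vector(moved)))
print("helix vs strand:     ", hash_distance(hash_vector(helix), hash_vector(strand)))
```

Because distances to the centroid do not change under rotation or translation, the rigidly moved helix hashes to essentially the same vector without any superposition step. In a search setting, the property stated in the abstract suggests using the hash distance as a cheap pre-filter: fragment pairs whose hash vectors are far apart can be skipped before computing the more expensive root mean square distance.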
