Shyu Chi-Ren, Chi Pin-Hao, Scott Grant, Xu Dong
Medical and Biological Digital Library Research Lab, 322 Engineering Building North, Department of Computer Science, University of Missouri, Columbia, MO 65211-2060, USA.
Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W572-5. doi: 10.1093/nar/gkh436.
We have developed a web server, ProteinDBS, that lets the life science community search for similar protein tertiary structures in real time. The system applies computer visualization techniques to extract the predominant visual patterns encoded in two-dimensional distance matrices generated from the three-dimensional coordinates of protein chains. Once the meaningful content has been extracted from the distance matrices and represented in a multi-dimensional feature space, an advanced indexing structure, the Entropy Balanced Statistical (EBS) k-d tree, is used to index the data. The system returns ranked search results from a database of 46,075 chains in seconds, with a reasonably high degree of precision. To our knowledge, this is the first real-time search engine for protein structure comparison. ProteinDBS provides two query methods: by Protein Data Bank protein chain ID, and by a new structure uploaded by the user. The system is hosted at http://ProteinDBS.rnet.missouri.edu.
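The pipeline the abstract outlines (coordinates, then distance matrix, then feature vector, then ranked retrieval) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the coordinates are made up, `toy_features` is a hypothetical distance histogram rather than the visual features ProteinDBS extracts, and the brute-force sort in `rank_chains` stands in for the EBS k-d tree, whose construction the abstract does not describe.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3-D points (e.g. one atom per residue)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def toy_features(matrix, bin_width=4.0, n_bins=5):
    """Hypothetical feature vector: normalized histogram of upper-triangle distances.

    This is NOT the feature set used by ProteinDBS; it only shows how a
    distance matrix can be reduced to a fixed-length vector.
    """
    counts = [0] * n_bins
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            # Distances beyond the last bin edge are clamped into the last bin.
            counts[min(int(matrix[i][j] // bin_width), n_bins - 1)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def rank_chains(query_features, database):
    """Return chain labels sorted by feature-space distance to the query.

    A linear scan is used here in place of an index structure such as a k-d tree.
    """
    return sorted(database, key=lambda label: math.dist(query_features, database[label]))


# Made-up coordinates for two tiny "chains" (labels are hypothetical).
straight = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

database = {
    "chain_straight": toy_features(distance_matrix(straight)),
    "chain_bent": toy_features(distance_matrix(bent)),
}
ranking = rank_chains(toy_features(distance_matrix(straight)), database)
```

Because the distance matrix depends only on distances between points within a chain, it is unchanged by rotating or translating the chain, which is one reason such matrices are a convenient starting point for structure comparison.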