National ICT Australia Victoria Research Laboratory, The University of Melbourne, Melbourne, Victoria, Australia.
BMC Bioinformatics. 2010 Jan 18;11 Suppl 1(Suppl 1):S46. doi: 10.1186/1471-2105-11-S1-S46.
Protein structure comparison is a fundamental task in structural biology. Although the number of known protein structures has grown rapidly over the last decade, searching a large database of protein structures with existing methods is still relatively slow. New techniques are needed that can compare protein structures rapidly while maintaining high matching accuracy.
We have developed IR Tableau, a fast protein comparison algorithm that uses the tableau representation to compare protein tertiary structures. IR Tableau compares tableaux using information-retrieval-style feature indexing techniques. Experimental analysis on the ASTRAL SCOP protein structural domain database demonstrates that IR Tableau achieves a speedup of two orders of magnitude over the search times of existing methods, while producing search results of comparable accuracy.
We show that substantial speedups can be obtained for the protein structure comparison problem by employing an information-retrieval-style approach to indexing proteins. The comparison accuracy achieved is also strong, opening the way for large-scale processing of very large protein structure databases.
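The general idea of comparing tableaux through information-retrieval-style feature indexing can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature definition (one feature per pair of secondary structure elements, combining their types and an orientation code), the inverted index, the cosine scoring, and all names and toy data below are assumptions made purely for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import sqrt


def tableau_features(sse_types, tableau):
    """Count one feature per SSE pair: (type_i, type_j, orientation code).

    Hypothetical feature definition; tableau[i][j] is read only for i < j.
    """
    feats = Counter()
    for i, j in combinations(range(len(sse_types)), 2):
        feats[(sse_types[i], sse_types[j], tableau[i][j])] += 1
    return feats


def build_index(database):
    """Build an inverted index: feature -> {domain id: count}, plus vector norms."""
    index = defaultdict(dict)
    norms = {}
    for name, feats in database.items():
        for feature, count in feats.items():
            index[feature][name] = count
        norms[name] = sqrt(sum(c * c for c in feats.values()))
    return index, norms


def search(query, index, norms):
    """Rank domains by cosine similarity, visiting only postings of query features."""
    dots = defaultdict(float)
    for feature, q_count in query.items():
        for name, count in index.get(feature, {}).items():
            dots[name] += q_count * count
    q_norm = sqrt(sum(c * c for c in query.values()))
    ranked = [(name, dot / (q_norm * norms[name])) for name, dot in dots.items()]
    ranked.sort(key=lambda pair: -pair[1])
    return ranked


# Toy data: "E" = strand, "H" = helix; the two-letter codes are illustrative only.
query = tableau_features(
    ["E", "E", "H"],
    [[None, "OT", "RT"], [None, None, "LE"], [None, None, None]],
)
database = {
    "dA": tableau_features(  # identical to the query
        ["E", "E", "H"],
        [[None, "OT", "RT"], [None, None, "LE"], [None, None, None]],
    ),
    "dB": tableau_features(  # shares no feature with the query
        ["H", "H"],
        [[None, "PE"], [None, None]],
    ),
    "dC": tableau_features(  # shares two of three features with the query
        ["E", "E", "H"],
        [[None, "OT", "PD"], [None, None, "LE"], [None, None, None]],
    ),
}
index, norms = build_index(database)
results = search(query, index, norms)
```

In this sketch a query touches only the index entries for features it actually contains, so domains that share no feature with it (here `dB`) are never scored. That property is what makes index-based search cheap compared with aligning the query against every database structure, although the real speedup reported above depends on the authors' own feature and scoring choices.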