Ideaconsult Ltd, 4 A. Kanchev Str., Sofia 1000, Bulgaria phone/fax: +359886802011.
University of Plovdiv, Department of Analytical Chemistry and Computer Chemistry, 24 Tsar Asen Str., Plovdiv 4000, Bulgaria.
Mol Inform. 2011 Aug;30(8):707-20. doi: 10.1002/minf.201100028. Epub 2011 Aug 4.
We present new developments in the AMBIT open-source software package for efficient searching of chemical structures and structural fragments. AMBIT-SMARTS is a Java-based software package built on top of the Chemistry Development Kit. The AMBIT-SMARTS parser implements the entire SMARTS language specification, with several syntax extensions that support custom modifications introduced by third-party software packages such as OpenEye, MOE and OpenBabel. The goal of yet another open-source SMARTS parser implementation is to achieve better performance and compatibility with multiple existing flavours of the SMARTS language, as well as to provide utilities for running efficient SMARTS queries in large structural databases. We describe a combination of approaches for lowering the computational cost and improving the response time of substructure queries. An exhaustive comparison of the AMBIT algorithm with several subgraph isomorphism implementations is performed. To demonstrate the performance of the entire system from an end-user point of view, response time statistics for Web service substructure search queries against a database of 4.5 M structures are also reported. The package is widely applicable to chemoinformatics tasks. It has already been used in several projects dealing with descriptor calculation and predictive algorithms, database queries, web applications and web services.
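The abstract does not name the specific approaches used to lower the cost of substructure queries. As an illustration only, the sketch below shows one widely used technique in substructure search, fingerprint prescreening: a cheap bitwise test discards database entries that cannot contain the query, so that the expensive subgraph isomorphism step runs only on the survivors. The class `FingerprintScreen`, its methods, and the toy bit positions are hypothetical; they are not the AMBIT or CDK API, and real fingerprints would be derived from the structures themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical illustration of fingerprint prescreening; not AMBIT code.
public class FingerprintScreen {

    /** True if every bit set in the query fingerprint is also set in the target. */
    static boolean mayContain(BitSet target, BitSet query) {
        BitSet missing = (BitSet) query.clone();
        missing.andNot(target); // bits the query requires but the target lacks
        return missing.isEmpty();
    }

    /** Returns the indices of database entries that pass the screen. */
    static List<Integer> screen(List<BitSet> database, BitSet query) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < database.size(); i++) {
            if (mayContain(database.get(i), query)) {
                candidates.add(i);
            }
        }
        return candidates;
    }

    /** Builds a toy fingerprint from explicit bit positions (assumed values). */
    static BitSet bits(int... positions) {
        BitSet b = new BitSet();
        for (int p : positions) {
            b.set(p);
        }
        return b;
    }

    public static void main(String[] args) {
        List<BitSet> database = Arrays.asList(
                bits(0, 1, 2, 5),
                bits(1, 3),
                bits(0, 2, 5, 7));
        BitSet query = bits(0, 5);
        // Only entries 0 and 2 carry both query bits.
        System.out.println(screen(database, query)); // prints [0, 2]
    }
}
```

A screen of this kind can only rule structures out. Every surviving candidate still has to be confirmed by a full SMARTS match, because two different fragments may set the same bits.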