Chen I M, Kosky A S, Markowitz V M, Szeto E, Topaloglou T
Bioinformatics Systems Division, Gene Logic Inc., Berkeley, CA 94704, USA.
Proc Int Conf Intell Syst Mol Biol. 1998;6:43-51.
Existing query interfaces for biological databases are based either on fixed forms or on textual query languages. Users of a fixed-form interface are limited to a set of predefined queries that provide a fixed view of the underlying database, while users of a textual query language interface must understand the underlying data models, the specific query languages, and the application schemas in order to formulate queries. Further, operations on application-specific complex data (e.g., DNA sequences, proteins), which are usually provided by a variety of software packages with their own format requirements and peculiarities, are neither available as part of biological query interfaces nor integrated with them. In this paper, we describe generic tools that provide powerful and flexible support for interactively exploring biological databases in a uniform and consistent way, that is, via common data models, formats, and notations, in the framework of the Object-Protocol Model (OPM). These tools include (i) a Java graphical query construction tool with support for automatic generation of Web query forms, which can either be used for further specifying conditions or be saved and customized; (ii) query processors for interpreting and executing queries that may involve complex application-specific objects and that may span multiple heterogeneous databases and file systems; and (iii) utilities for automatic generation of HTML pages containing query results, which can be viewed in a Web browser. These tools avoid the restrictions imposed by traditional fixed-form query interfaces, while providing users with simple and intuitive facilities for formulating ad hoc queries across heterogeneous databases, without the need to understand the underlying data models and query languages.