Karp P D
Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA.
J Comput Biol. 1995 Winter;2(4):573-86. doi: 10.1089/cmb.1995.2.573.
Realizing the full potential of biological databases (DBs) requires more than the interactive, hypertext flavor of database interoperation that is now so popular in the bioinformatics community. Interoperation based on declarative queries to multiple network-accessible databases will support analyses and investigations that are orders of magnitude faster and more powerful than what can be accomplished through interactive navigation. I present a vision of the capabilities that a query-based interoperation infrastructure should provide, and identify the assumptions underlying this vision and the requirements it imposes. I then propose an architecture for query-based interoperation that includes a number of novel components of an information infrastructure for molecular biology. These components include a knowledge base that describes relationships among the conceptualizations used in different biological databases, a module that can determine which DBs are relevant to a particular query, a module that can translate a query and its results from one conceptualization to another, a collection of DB drivers that provide uniform physical access to different database management systems, a suite of translators that can interconvert among different database schema languages, and a database that describes the network locations and access methods of biological databases. A number of the components are translators that bridge the heterogeneities that exist between biological DBs at several different levels, including the conceptual level, the data model, the query language, and data formats.
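The way the components listed above could fit together can be illustrated with a small Python sketch. It is not the architecture proposed in the paper, only a toy under stated assumptions: every name in it (`Query`, `Source`, `relevant_sources`, `translate_query`, `translate_results`, `run`, `memory_driver`, the sources `db_a`, `db_b`, `db_c`, and the `example.org` locations) is hypothetical. The knowledge base is reduced to per-source name mappings, the drivers are in-memory functions, and the schema-language translators are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]


@dataclass
class Query:
    concept: str           # shared concept name, e.g. "gene" (hypothetical vocabulary)
    where: Dict[str, str]  # shared attribute -> required value


@dataclass
class Source:
    name: str
    location: str                 # network location; placeholder, never contacted here
    concept_map: Dict[str, str]   # shared concept -> local table name
    attr_map: Dict[str, str]      # shared attribute -> local attribute name
    driver: Callable[[str, Dict[str, str]], List[Row]]  # uniform access function


def memory_driver(tables: Dict[str, List[Row]]) -> Callable[[str, Dict[str, str]], List[Row]]:
    """Stand-in for a DB driver: filters in-memory rows instead of calling a DBMS."""
    def fetch(table: str, where: Dict[str, str]) -> List[Row]:
        return [r for r in tables[table]
                if all(r.get(k) == v for k, v in where.items())]
    return fetch


def relevant_sources(query: Query, registry: List[Source]) -> List[Source]:
    """Keep only sources that model the concept and every attribute in the query."""
    return [s for s in registry
            if query.concept in s.concept_map
            and all(a in s.attr_map for a in query.where)]


def translate_query(query: Query, source: Source) -> Tuple[str, Dict[str, str]]:
    """Rewrite a shared-vocabulary query into the source's local names."""
    table = source.concept_map[query.concept]
    where = {source.attr_map[a]: v for a, v in query.where.items()}
    return table, where


def translate_results(rows: List[Row], source: Source) -> List[Row]:
    """Rename local attributes in the results back to the shared vocabulary."""
    back = {local: shared for shared, local in source.attr_map.items()}
    return [{back.get(k, k): v for k, v in row.items()} for row in rows]


def run(query: Query, registry: List[Source]) -> List[Row]:
    """Select sources, translate the query, fetch, and translate results back."""
    results: List[Row] = []
    for s in relevant_sources(query, registry):
        table, where = translate_query(query, s)
        rows = translate_results(s.driver(table, where), s)
        results.extend(dict(r, source=s.name) for r in rows)
    return results


# Hypothetical registry: three sources with different local names for the same ideas.
registry = [
    Source("db_a", "db-a.example.org", {"gene": "GENES"},
           {"name": "symbol", "organism": "organism"},
           memory_driver({"GENES": [
               {"symbol": "trpA", "organism": "E. coli"},
               {"symbol": "CDC28", "organism": "S. cerevisiae"}]})),
    Source("db_b", "db-b.example.org", {"gene": "locus"},
           {"name": "gene_name", "organism": "species"},
           memory_driver({"locus": [
               {"gene_name": "lacZ", "species": "E. coli"}]})),
    Source("db_c", "db-c.example.org", {"pathway": "pathways"},
           {"name": "title"},
           memory_driver({"pathways": [{"title": "glycolysis"}]})),
]

results = run(Query("gene", {"organism": "E. coli"}), registry)
```

In this sketch a single declarative query for genes of one organism reaches `db_a` and `db_b`, skips `db_c` because it does not model genes, and returns rows in one shared vocabulary tagged with their source. Keeping source selection, query translation, result translation, and physical access as separate functions mirrors the separation of components described in the abstract, so each could be replaced independently.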