Murphy S N, Morgan M M, Barnett G O, Chueh H C
Laboratory of Computer Science, Massachusetts General Hospital, Boston, USA.
Proc AMIA Symp. 1999:892-6.
Over the past two years we have developed the specifications for, and implemented, a large relational database (a data warehouse) for finding research cohorts in data similar to that contained in the clinical COSTAR database at the Massachusetts General Hospital. We reviewed 16 years of COSTAR research queries to determine the most common search strategies. These strategies are relevant to the general research community because they were written in the Medical Query Language (MQL) developed for the COSTAR M database. MQL is extremely flexible (much more so than SQL) and allows completely ad hoc searches over coded fields, text reports, and laboratory values, so the queries reflect what researchers wanted to ask rather than what the query tool permitted. From this review we derived user specifications for a research-oriented healthcare data warehouse that could support 90% of the queries. The data warehouse was implemented in a relational database using a star schema, which allows highly optimized analytical processing. As a result, queries that ran slowly in the M database ran very rapidly in the relational database, and the data warehouse could scale effectively.
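The star-schema design mentioned above can be illustrated with a minimal sketch. It is not the schema used in the work described here: the table names (`patient_dim`, `concept_dim`, `observation_fact`), the columns, the sample rows, and the helper functions are all hypothetical, chosen only to show how one central fact table joined to small dimension tables can answer a cohort query that combines a coded diagnosis with a laboratory value. SQLite is used purely for convenience.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny in-memory star schema (all names are illustrative assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables: one row per patient, one row per coded concept.
        CREATE TABLE patient_dim (patient_id INTEGER PRIMARY KEY, birth_year INTEGER);
        CREATE TABLE concept_dim (concept_id INTEGER PRIMARY KEY, kind TEXT, name TEXT);
        -- Fact table: one row per recorded observation, keyed to the dimensions.
        CREATE TABLE observation_fact (
            patient_id INTEGER REFERENCES patient_dim(patient_id),
            concept_id INTEGER REFERENCES concept_dim(concept_id),
            obs_date   TEXT,
            num_value  REAL
        );
        CREATE INDEX idx_fact_concept ON observation_fact (concept_id, patient_id);
    """)
    conn.executemany("INSERT INTO patient_dim VALUES (?, ?)",
                     [(1, 1940), (2, 1955), (3, 1962)])
    conn.executemany("INSERT INTO concept_dim VALUES (?, ?, ?)",
                     [(10, "diagnosis", "diabetes mellitus"),
                      (20, "lab", "hemoglobin A1c")])
    conn.executemany("INSERT INTO observation_fact VALUES (?, ?, ?, ?)", [
        (1, 10, "1998-01-05", None),
        (1, 20, "1998-02-01", 9.1),
        (2, 10, "1997-06-12", None),
        (2, 20, "1997-07-01", 6.4),
        (3, 20, "1998-03-03", 8.8),
    ])
    return conn


def find_cohort(conn, diagnosis, lab, min_value):
    """Return sorted ids of patients with the diagnosis and a lab value above min_value."""
    sql = """
        SELECT DISTINCT d.patient_id
        FROM observation_fact AS d
        JOIN concept_dim AS dc ON dc.concept_id = d.concept_id
        JOIN observation_fact AS l ON l.patient_id = d.patient_id
        JOIN concept_dim AS lc ON lc.concept_id = l.concept_id
        WHERE dc.kind = 'diagnosis' AND dc.name = ?
          AND lc.kind = 'lab' AND lc.name = ?
          AND l.num_value > ?
        ORDER BY d.patient_id
    """
    return [row[0] for row in conn.execute(sql, (diagnosis, lab, min_value))]


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    print(find_cohort(warehouse, "diabetes mellitus", "hemoglobin A1c", 8.0))
```

In this sketch every criterion is a lookup against the same fact table, so adding a criterion adds a join rather than a new table, which is one reason a star schema tends to suit ad hoc cohort searches.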