Pratt Wanda, Yetisgen-Yildiz Meliha
Biomedical and Health Informatics, School of Medicine, University of Washington, Seattle, USA.
AMIA Annu Symp Proc. 2003;2003:529-33.
Although huge amounts of unstructured text are available as a rich source of biomedical knowledge, processing this unstructured knowledge requires tools that identify concepts in free-form text. MetaMap is one tool that system developers in biomedicine have commonly used for this task, but few have studied how well it performs the task in general. In this paper, we report on a study that compares MetaMap's performance against that of six people. Such studies are challenging because the task is inherently subjective and establishing consensus is difficult. Nonetheless, for those concepts that subjects generally agreed on, MetaMap identified most of them, provided they were represented in the UMLS. However, MetaMap also identified many other concepts that people did not. We also report on our analysis of the types of failures that MetaMap exhibited, as well as trends in the way people chose to identify concepts.