Hersh W, Price S, Donohoe L
Division of Medical Informatics & Outcomes Research, Oregon Health Sciences University, Portland, Oregon, USA.
Proc AMIA Symp. 2000:344-8.
To assess whether query expansion using thesaurus relationships and definitions from the UMLS Metathesaurus improves search performance.
The queries from a MEDLINE test collection (OHSUMED) were expanded using synonym, hierarchical, and related-term information, as well as term definitions, from the UMLS Metathesaurus. Documents were retrieved with a word-statistical retrieval system, and the results were assessed for recall and precision against the relevance judgments in the test collection.
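As a rough illustration of the method described above, the following sketch expands a query with synonyms and scores a retrieved set against relevance judgments. The thesaurus entries, document identifiers, and function names are hypothetical stand-ins; the study itself drew on UMLS Metathesaurus relationships and the OHSUMED judgments, and this is not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy thesaurus (assumption); stands in for Metathesaurus synonym links.
TOY_SYNONYMS: Dict[str, List[str]] = {
    "hypertension": ["high blood pressure"],
    "myocardial infarction": ["heart attack"],
}


def expand_query(terms: List[str], thesaurus: Dict[str, List[str]]) -> List[str]:
    """Return the original terms followed by their thesaurus entries, without duplicates."""
    expanded = list(terms)
    for term in terms:
        for extra in thesaurus.get(term.lower(), []):
            if extra not in expanded:
                expanded.append(extra)
    return expanded


def recall_precision(retrieved: List[str], relevant: Set[str]) -> Tuple[float, float]:
    """Compute set-based recall and precision for one query."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    return recall, precision


if __name__ == "__main__":
    print(expand_query(["hypertension", "treatment"], TOY_SYNONYMS))
    # Made-up document IDs and judgments, for illustration only.
    print(recall_precision(["d1", "d2", "d3", "d4"], {"d1", "d3", "d9"}))
```

Recall is the share of relevant documents that were retrieved, and precision is the share of retrieved documents that are relevant. Hierarchical or related-term expansion could reuse `expand_query` with a different mapping.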
All types of query expansion degraded aggregate retrieval performance as measured by recall and precision, although 38.6% of the queries with synonym expansion and up to 29.7% of the queries with hierarchical expansion showed improvement.
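The distinction between aggregate performance and per-query improvement can be made concrete with a small sketch. The scores below are invented solely to show how a mean can fall while some queries improve; they are not data from the study, and `compare_runs` is a hypothetical helper.

```python
from statistics import mean
from typing import Dict


def compare_runs(baseline: Dict[str, float], expanded: Dict[str, float]) -> Dict[str, float]:
    """Compare per-query scores for two runs over the queries they share."""
    shared = sorted(set(baseline) & set(expanded))
    if not shared:
        raise ValueError("the two runs have no queries in common")
    # A query counts as improved only if its expanded score is strictly higher.
    improved = [q for q in shared if expanded[q] > baseline[q]]
    return {
        "mean_baseline": mean(baseline[q] for q in shared),
        "mean_expanded": mean(expanded[q] for q in shared),
        "fraction_improved": len(improved) / len(shared),
    }


if __name__ == "__main__":
    # Invented per-query precision values (assumption, for illustration only).
    baseline = {"q1": 0.5, "q2": 0.5, "q3": 0.5, "q4": 0.5}
    expanded = {"q1": 0.75, "q2": 0.25, "q3": 0.25, "q4": 0.25}
    print(compare_runs(baseline, expanded))
```

In this toy case one query in four improves, yet the mean score drops, which is the same pattern the results report at larger scale.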
Thesaurus-based query expansion generally degrades retrieval performance but improves it in specific instances. Further research must focus on identifying the instances where performance improves and on how real users can exploit them.