Purcell G P, Shortliffe E H
Section on Medical Informatics, Stanford University School of Medicine, California 94305-5479, USA.
Proc Annu Symp Comput Appl Med Care. 1995:851-7.
Conventional methods for retrieving information from the medical literature are imprecise and inefficient. Information retrieval systems employ unmanageable indexing vocabularies or use full-text representations that overwhelm the user with irrelevant information. This paper describes a document representation designed to improve the precision of searching in textual databases without significantly compromising recall. The representation augments simple text word representations with contextual models that reflect recurring semantic themes in clinical publications. Using this representation, a searcher may indicate both the terms of interest and the contexts in which they should occur. The contexts limit the potential interpretations of text words, and thus form the basis for more precise searching. In this paper, we discuss the shortcomings of traditional retrieval systems and describe our context-based representation. Improved retrieval performance with contextual models is illustrated by example, and a more extensive study is proposed. We present an evaluation of the contextual models as an indexing scheme, using a variation of the traditional inter-indexer consistency experiments, and we demonstrate that contextual indexing is reproducible by minimally trained physicians and medical students.
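The idea of pairing each search term with the context in which it should occur can be sketched in a few lines of Python. This is only an illustrative toy, not the authors' implementation: the abstract does not list the actual contextual models, so the context labels (`treatment`, `adverse_effect`, `outcome`), the `Document` class, and both search functions are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Document:
    """Toy document: text words grouped under hypothetical context labels."""
    doc_id: str
    # Maps a context label to the set of text words indexed under that context.
    contexts: Dict[str, Set[str]] = field(default_factory=dict)

    def all_words(self) -> Set[str]:
        """Union of the words from every context (a plain text-word view)."""
        words: Set[str] = set()
        for context_words in self.contexts.values():
            words |= context_words
        return words


def plain_search(docs: List[Document], term: str) -> List[str]:
    """Text-word search: match the term wherever it occurs."""
    return [d.doc_id for d in docs if term in d.all_words()]


def context_search(docs: List[Document], term: str, context: str) -> List[str]:
    """Context-restricted search: match the term only inside the given context."""
    return [d.doc_id for d in docs if term in d.contexts.get(context, set())]


# Two made-up documents that both mention "aspirin", in different roles.
docs = [
    Document("A", {"treatment": {"aspirin", "stroke"}, "outcome": {"mortality"}}),
    Document("B", {"adverse_effect": {"aspirin", "bleeding"}, "treatment": {"warfarin"}}),
]

print(plain_search(docs, "aspirin"))
print(context_search(docs, "aspirin", "treatment"))
```

In this toy data, the plain search matches both documents, while the context-restricted search keeps only the document in which the term appears under the requested context. That narrowing is the kind of precision gain the abstract describes, shown here on invented data rather than measured results.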