Whalen Gregory
Department of Computer Science, Columbia University, New York, NY 10027, USA.
AMIA Annu Symp Proc. 2005;2005:814-8.
We present a method for automated medical textbook and encyclopedia summarization. Using statistical sentence extraction and semantic relationships, we extract sentences from text returned as part of an existing textbook search (similar to a book index). Our system guides users to the information they desire by summarizing the content of each relevant chapter or section returned in the search. The summary is tailored to contain sentences that specifically address the user's search terms. Our clustering method selects sentences that contain concepts specifically addressing the context of the query term in each of the returned sections. Our method examines conceptual relationships from the UMLS and selects clusters of concepts using Expectation Maximization (EM). Sentences associated with the concept clusters are shown to the user. We evaluated whether our extracted summary provides a suitable answer to the user's question.
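The abstract does not specify the features or the mixture model behind the EM clustering step, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each concept already has a numeric relatedness score to the query term (a hypothetical stand-in for the UMLS relationships; no real UMLS API is called), fits a two-component one-dimensional Gaussian mixture with EM, and keeps the sentences that mention a concept from the higher-scoring cluster. All names, scores, and sentences below are invented for illustration.

```python
import math

def em_two_clusters(scores, iterations=25, min_var=1e-3):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns, for each score, the probability that it belongs to the
    component that was initialised on the higher half of the scores.
    """
    midpoint = (min(scores) + max(scores)) / 2
    # Hard initial assignment: 1.0 = high cluster, 0.0 = low cluster.
    resp = [1.0 if x >= midpoint else 0.0 for x in scores]
    for _ in range(iterations):
        # M-step: re-estimate weight, mean and variance of each component.
        params = []
        for r in (resp, [1 - p for p in resp]):  # high first, then low
            n = max(sum(r), 1e-9)
            mean = sum(p * x for p, x in zip(r, scores)) / n
            var = sum(p * (x - mean) ** 2 for p, x in zip(r, scores)) / n
            params.append((n / len(scores), mean, max(var, min_var)))
        # E-step: recompute each score's membership in the high cluster.
        new_resp = []
        for x in scores:
            hi, lo = (
                w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in params
            )
            new_resp.append(hi / (hi + lo) if hi + lo > 0 else 0.5)
        resp = new_resp
    return resp

def summarize(sentences, concept_scores, threshold=0.5):
    """Keep sentences that mention at least one concept from the high cluster."""
    concepts = sorted(concept_scores)
    resp = em_two_clusters([concept_scores[c] for c in concepts])
    selected = {c for c, p in zip(concepts, resp) if p > threshold}
    return [text for text, found in sentences if selected & set(found)]

# Hypothetical relatedness of each concept to the query term "hypertension".
CONCEPT_SCORES = {
    "hypertension": 0.9, "blood pressure": 0.85, "diuretic": 0.8,
    "stroke": 0.7, "chapter": 0.1, "edition": 0.05,
}
# Hypothetical sentences, each paired with the concepts found in it.
SENTENCES = [
    ("Thiazide diuretics lower blood pressure.", ["diuretic", "blood pressure"]),
    ("This chapter was revised for the new edition.", ["chapter", "edition"]),
    ("Hypertension raises the risk of stroke.", ["hypertension", "stroke"]),
]

summary = summarize(SENTENCES, CONCEPT_SCORES)
for line in summary:
    print(line)
```

In this toy setup the two clinical sentences are kept and the editorial sentence is dropped. A real system would need a concept extractor and a principled way to derive the scores from UMLS relationships, neither of which is shown here.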