Friedlin Jeff, Overhage Marc
Regenstrief Institute, Inc., Indiana University School of Medicine, Indianapolis, IN, USA.
AMIA Annu Symp Proc. 2011;2011:435-44. Epub 2011 Oct 22.
We evaluated how well the Unified Medical Language System (UMLS) represents concepts derived from medical narrative documents in three domains: chest x-ray reports, discharge summaries, and admission notes. We detected concepts in these documents by identifying noun phrases (NPs) and N-grams, including unigrams (single words), bigrams (word pairs), and trigrams (word triples). After removing NPs and N-grams that did not represent discrete clinical concepts, we processed the remaining candidates with the UMLS MetaMap program. We manually reviewed the MetaMap output to determine whether MetaMap found a full, partial, or no representation of each concept. For full representations, we determined whether post-coordination was required. Our results showed that a large portion of the concepts found in clinical narrative documents is either unrepresented or poorly represented in the current version of the UMLS Metathesaurus, and that post-coordination was often required to fully represent a concept.
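The N-gram step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the text was tokenized or how sentences were delimited, so the regular expressions and the function names `tokenize`, `ngrams`, and `candidate_ngrams` are assumptions made for illustration only. The sketch does not cover noun-phrase detection, the manual filtering of non-clinical candidates, or MetaMap processing.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens.

    Assumption: hyphenated terms such as "x-ray" are kept as one token.
    """
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_ngrams(text, max_n=3):
    """Count unigrams, bigrams, and trigrams in a narrative document.

    Assumption: N-grams do not cross sentence boundaries, and sentences
    end at '.', '!', '?', ';', or a newline.
    """
    counts = Counter()
    for sentence in re.split(r"[.!?;\n]+", text):
        tokens = tokenize(sentence)
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return counts


if __name__ == "__main__":
    report = "No acute pleural effusion. Pleural effusion resolved."
    for phrase, count in candidate_ngrams(report).most_common(5):
        print(count, phrase)
```

Each resulting phrase is only a candidate concept. In the workflow the abstract describes, candidates that did not represent discrete clinical concepts were removed before the rest were passed to MetaMap.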