Spackman K A, Hersh W R
Biomedical Information Communication Center, Oregon Health Sciences University, Portland, USA.
Proc AMIA Annu Fall Symp. 1996:155-8.
We evaluated the ability of two natural language parsers, CLARIT and the Xerox Tagger, to identify simple noun phrases in medical discharge summaries. In twenty randomly selected discharge summaries, there were 1909 unique simple noun phrases. CLARIT and the Xerox Tagger exactly identified 77.0% and 68.7% of the phrases, respectively, and partially identified 85.7% and 80.8% of the phrases. Neither system had been specially modified or tuned to the medical domain. These results suggest that existing natural language processing (NLP) techniques can be applied to large bodies of medical text to identify empirically the terminology used in medicine. Virtually all the noun phrases could be regarded as having a specifically medical connotation and would be candidates for entry into a controlled medical vocabulary.
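The exact and partial identification rates reported above can be illustrated with a minimal sketch. The abstract does not define how a partial match was scored, so this sketch assumes that a gold phrase counts as partially identified when it shares at least one word with some phrase proposed by the parser, and that exact matches also count as partial ones. The function name `score_phrases` and the sample phrases are hypothetical and are not taken from the study.

```python
def score_phrases(gold, predicted):
    """Return (exact_rate, partial_rate) over the unique gold phrases.

    Assumption: "partial" means sharing at least one word with any
    predicted phrase; the original study may have used another criterion.
    """
    gold_set = {p.lower() for p in gold}
    pred_set = {p.lower() for p in predicted}
    if not gold_set:
        return 0.0, 0.0

    # Word sets of every phrase the parser proposed.
    pred_tokens = [set(p.split()) for p in pred_set]

    # Exact: the gold phrase appears verbatim among the predictions.
    exact = sum(1 for g in gold_set if g in pred_set)
    # Partial: the gold phrase overlaps some prediction by at least one word.
    partial = sum(
        1 for g in gold_set
        if any(set(g.split()) & tokens for tokens in pred_tokens)
    )

    n = len(gold_set)
    return exact / n, partial / n


# Hypothetical example data, for illustration only.
gold = ["chest pain", "left lower lobe", "atrial fibrillation", "blood pressure"]
predicted = ["chest pain", "lower lobe", "atrial fibrillation"]
exact_rate, partial_rate = score_phrases(gold, predicted)
print(f"exact: {exact_rate:.1%}, partial: {partial_rate:.1%}")
```

In this toy example, two of the four gold phrases are matched exactly and a third ("left lower lobe") is matched only in part, which mirrors the gap between the exact and partial figures reported for each parser.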