Friedman Carol, Liu Hongfang, Shagina Lyudmila
Department of Medical Informatics, Columbia University, 622 West 168 Street, VC-5 Bldg, New York, NY 10032, USA.
J Biomed Inform. 2003 Jun;36(3):189-201. doi: 10.1016/j.jbi.2003.08.005.
Medical terminologies are critical for automated healthcare systems. Some terminologies, such as the UMLS and SNOMED, are comprehensive, whereas others specialize in limited domains (e.g., BIRADS) or are developed for specific applications. Two important features of a terminology are comprehensive coverage of relevant clinical terms and ease of use by its users, which include computerized applications. We have developed a method for facilitating vocabulary development and maintenance that uses natural language processing to mine large collections of clinical reports for information on terminology as expressed by physicians. Once the reports are processed and the terms are structured and collected into an XML representational schema, it is possible to determine information about terms, such as frequency of occurrence, compositionality, relations to other terms (such as modifiers), and correspondence to a controlled vocabulary. This paper describes the method and discusses how it can be used as a tool to help vocabulary builders navigate through the terms physicians use, visualize their relations to other terms via a flexible viewer, and determine their correspondence to a controlled vocabulary.
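As a rough illustration of the kind of analysis the abstract describes, the sketch below tallies term frequency, modifier co-occurrence, and coverage by a controlled vocabulary from structured XML. It is not the authors' implementation: the element and attribute names (`report`, `term`, `modifier`, `name`, `type`), the sample data, the `DEMO:` codes, and the function `summarize_terms` are all assumptions made up for this example, since the abstract does not specify the actual schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical structured NLP output. The schema is an illustrative
# assumption, not the XML representation used in the paper.
SAMPLE_XML = """
<reports>
  <report id="r1">
    <term name="mass">
      <modifier type="size">small</modifier>
      <modifier type="bodyloc">left breast</modifier>
    </term>
    <term name="calcification"/>
  </report>
  <report id="r2">
    <term name="mass">
      <modifier type="bodyloc">left breast</modifier>
    </term>
    <term name="architectural distortion"/>
  </report>
</reports>
"""

# Hypothetical controlled vocabulary: term name -> made-up placeholder code.
CONTROLLED_VOCAB = {"mass": "DEMO:0001", "calcification": "DEMO:0002"}


def summarize_terms(xml_text, vocabulary):
    """Return (frequency, modifiers, unmapped) for the terms in xml_text.

    frequency: Counter of term name -> number of occurrences
    modifiers: term name -> Counter of (modifier type, modifier text)
    unmapped:  sorted term names that have no entry in the vocabulary
    """
    root = ET.fromstring(xml_text.strip())
    frequency = Counter()
    modifiers = defaultdict(Counter)
    for term in root.iter("term"):
        name = term.get("name")
        frequency[name] += 1
        for mod in term.findall("modifier"):
            text = (mod.text or "").strip()
            modifiers[name][(mod.get("type"), text)] += 1
    unmapped = sorted(name for name in frequency if name not in vocabulary)
    return frequency, modifiers, unmapped


if __name__ == "__main__":
    freq, mods, unmapped = summarize_terms(SAMPLE_XML, CONTROLLED_VOCAB)
    print("Term frequency:", dict(freq))
    print("Modifiers of 'mass':", dict(mods["mass"]))
    print("Terms missing from the vocabulary:", unmapped)
```

In this toy setup, the terms reported as missing from the vocabulary are the ones a vocabulary builder would review first, and the modifier counts give a crude view of how a term combines with others.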