Soysal Ergin, Wang Jingqi, Jiang Min, Wu Yonghui, Pakhomov Serguei, Liu Hongfang, Xu Hua
School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA.
Department of Pharmaceutical Care and Health System, University of Minnesota Twin Cities, Minneapolis, MN, USA.
J Am Med Inform Assoc. 2018 Mar 1;25(3):331-336. doi: 10.1093/jamia/ocx132.
Existing general clinical natural language processing (NLP) systems, such as MetaMap and the Clinical Text Analysis and Knowledge Extraction System (cTAKES), have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components but also a user-friendly graphical user interface (GUI) that helps users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP GUI in building customized, high-performance NLP pipelines with two use cases: extracting smoking status and extracting lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community.