Yuan Luo, Peter Szolovits
Dept. of Preventive Medicine, Northwestern University, Chicago, USA.
CSAIL, MIT, Cambridge, USA.
Proceedings (IEEE Int Conf Bioinformatics Biomed). 2018 Dec;2018:461-466. doi: 10.1109/bibm.2018.8621521. Epub 2019 Jan 24.
This paper presents LAPNLP, a Lisp architecture for a portable NLP system that processes clinical notes. LAPNLP integrates multiple standard, customized, and in-house NLP tools. The system supports portability across institutions and data systems by incorporating an enriched Common Data Model (CDM) that standardizes the necessary data elements. It uses UMLS for domain adaptation when integrating general-domain NLP tools. It also features stand-off annotations, which are specified by positional reference to the original document rather than embedded in the text. We built an interval-tree-based search engine that efficiently queries and retrieves stand-off annotations according to positional requirements. We also developed a utility that converts an inline annotation format to stand-off annotations, enabling the reuse of clinical text datasets that carry inline annotations. We applied the system to several NLP-facilitated tasks, including computational phenotyping for lymphoma patients and semantic relation extraction from clinical notes. These experiments demonstrate the broad applicability and utility of LAPNLP.
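To make the stand-off annotation and positional query ideas concrete, the following is a minimal sketch in Python rather than Lisp. It is not LAPNLP's implementation: the `<label>...</label>` inline tag syntax, the names `Annotation`, `inline_to_standoff`, `build_tree`, and `overlapping`, and the half-open character offsets are all assumptions made for illustration. The tree is a simple static variant of an interval tree (a balanced binary search tree keyed on start offset, with each node caching the largest end offset in its subtree), chosen for brevity.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Annotation:
    start: int  # inclusive character offset into the plain text
    end: int    # exclusive character offset
    label: str


# Hypothetical inline format: <label>covered text</label>
TAG = re.compile(r"<(/?)(\w+)>")


def inline_to_standoff(marked: str) -> Tuple[str, List[Annotation]]:
    """Strip inline tags; return the plain text and stand-off annotations."""
    plain: List[str] = []
    annotations: List[Annotation] = []
    open_tags: List[Tuple[str, int]] = []  # stack of (label, start offset)
    pos = 0     # read position in the marked-up text
    length = 0  # length of the plain text emitted so far
    for m in TAG.finditer(marked):
        chunk = marked[pos:m.start()]
        plain.append(chunk)
        length += len(chunk)
        pos = m.end()
        closing, label = m.group(1), m.group(2)
        if not closing:
            open_tags.append((label, length))
            continue
        if not open_tags or open_tags[-1][0] != label:
            raise ValueError(f"unbalanced closing tag: {label}")
        _, start = open_tags.pop()
        annotations.append(Annotation(start, length, label))
    if open_tags:
        raise ValueError(f"unclosed tag: {open_tags[-1][0]}")
    plain.append(marked[pos:])
    return "".join(plain), annotations


@dataclass
class Node:
    ann: Annotation
    max_end: int  # largest end offset anywhere in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(anns: List[Annotation]) -> Optional[Node]:
    """Build a balanced search tree keyed on annotation start offset."""
    if not anns:
        return None
    ordered = sorted(anns, key=lambda a: a.start)
    mid = len(ordered) // 2
    node = Node(ordered[mid], ordered[mid].end)
    node.left = build_tree(ordered[:mid])
    node.right = build_tree(ordered[mid + 1:])
    for child in (node.left, node.right):
        if child is not None:
            node.max_end = max(node.max_end, child.max_end)
    return node


def overlapping(node: Optional[Node], start: int, end: int) -> List[Annotation]:
    """Return annotations whose span intersects the query span [start, end)."""
    if node is None or node.max_end <= start:
        return []  # nothing in this subtree ends after the query starts
    found = overlapping(node.left, start, end)
    if node.ann.start < end and start < node.ann.end:
        found.append(node.ann)
    if node.ann.start < end:  # right subtree starts at or after this node
        found.extend(overlapping(node.right, start, end))
    return found


# Usage: convert an inline-annotated sentence, then query by position.
note = "Patient has <problem>lymphoma</problem> treated with <treatment>R-CHOP</treatment>."
text, anns = inline_to_standoff(note)
tree = build_tree(anns)
for hit in overlapping(tree, 0, text.index("treated")):
    print(hit.label, repr(text[hit.start:hit.end]))
```

Because each annotation stores only offsets and a label, the original note is never modified, and several tools can annotate the same text independently. The cached `max_end` lets a query skip any subtree that ends before the query span begins, which is what keeps positional lookups efficient as the number of annotations grows.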