CCMapper: An adaptive NLP-based free-text chief complaint mapping algorithm.

Author information

Department of Health Informatics and Data Science, Loyola University Chicago, Maywood, IL, USA; Center for Health Outcomes and Informatics Research, Loyola University Chicago, Maywood, IL, USA.

Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA; Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, USA.

Publication information

Comput Biol Med. 2019 Oct;113:103398. doi: 10.1016/j.compbiomed.2019.103398. Epub 2019 Aug 21.

DOI:10.1016/j.compbiomed.2019.103398

PMID:31454613

Abstract

OBJECTIVE

Chief complaint (CC) is among the earliest health information recorded at the beginning of a patient's visit to an emergency department (ED). We propose a heuristic methodology for automatically mapping the free-text data into a structured list of CCs.

METHODS

A comprehensive structured list categorizing CCs was developed by experienced Emergency Medicine (EM) physicians. Using this list, we developed a natural language processing-based algorithm, referred to as Chief Complaint Mapper (CCMapper), for automatically mapping a CC into the most appropriate category or categories. We trained and validated CCMapper using free-text CC data from the Mayo Clinic ED in Rochester, MN. We developed a consensus-based validation approach to handle both indifferences and disagreements between the two EM physicians who manually mapped a random sample of free-text CCs into categories within the structured list.
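
The abstract does not describe CCMapper's internal rules, so the following is only a minimal sketch of the general idea of mapping a free-text CC to one or more categories from a structured list. It assumes simple text normalization followed by whole-phrase keyword matching, which may differ from the published algorithm. The category names, trigger phrases, and function names (`CATEGORY_KEYWORDS`, `normalize`, `map_chief_complaint`) are hypothetical and are not taken from the physicians' list.

```python
import re

# Hypothetical structured list: category -> trigger phrases (illustrative only,
# not the list developed by the EM physicians).
CATEGORY_KEYWORDS = {
    "Chest pain": ["chest pain", "cp"],
    "Shortness of breath": ["shortness of breath", "sob", "dyspnea"],
    "Abdominal pain": ["abdominal pain", "abd pain", "stomach pain"],
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def map_chief_complaint(text):
    """Return every category whose trigger phrase appears as whole words.

    A free-text entry can mention several complaints, so the result is a
    list that may contain zero, one, or multiple categories.
    """
    # Pad with spaces so that " cp " does not match inside a longer word.
    cleaned = f" {normalize(text)} "
    matched = []
    for category, phrases in CATEGORY_KEYWORDS.items():
        if any(f" {phrase} " in cleaned for phrase in phrases):
            matched.append(category)
    return matched
```

Under these assumptions, an entry such as "CP and SOB x2 days" would map to two categories, while an entry with no known phrase would map to none and could be set aside for manual review.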

RESULTS

The kappa statistic demonstrated a high level of agreement (κ = 0.958) between the two physicians with less than 2% human error. CCMapper achieved a total sensitivity of 94.2% with a specificity of 99.8% and F-score of 94.7% on the validation set. The sensitivity of CCMapper when mapping free-text data with multiple CCs was 82.3% with a specificity of 99.1% and total F-score of 82.3%.
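
For readers less familiar with these measures, the sketch below shows how Cohen's kappa for two raters and pooled sensitivity, specificity, and F-score for multi-category mappings can be computed. The abstract does not state how the "total" figures were aggregated; pooling true/false positives and negatives over all (record, category) pairs is an assumption made here for illustration. The function names `cohens_kappa` and `pooled_metrics` are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one label per item."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's label frequencies.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def pooled_metrics(truth, predicted, categories):
    """Sensitivity, specificity, and F-score pooled over all categories.

    truth and predicted are sequences of sets of category names, one set per
    free-text entry. Assumes at least one true positive so that no
    denominator is zero.
    """
    tp = fp = fn = tn = 0
    for true_set, pred_set in zip(truth, predicted):
        for category in categories:
            in_true = category in true_set
            in_pred = category in pred_set
            if in_true and in_pred:
                tp += 1
            elif in_pred:
                fp += 1
            elif in_true:
                fn += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f_score
```

Because most categories do not apply to any given entry, true negatives dominate the pooled counts, which is one reason specificity in this kind of task tends to sit much closer to 100% than sensitivity does.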

CONCLUSION

Due to its simplicity, high performance, and capability of incorporating new free-text CC data, CCMapper can be readily adopted by other EDs to support clinical decision making. CCMapper can facilitate the development of predictive models for the type and timing of important events in the ED (e.g., ICU admission).
