Shi Jianlin, Liu Siru, Pruitt Liese C C, Luppens Carolyn L, Ferraro Jeffrey P, Gundlapalli Adi V, Chapman Wendy W, Bucher Brian T
School of Medicine, University of Utah, Salt Lake City, Utah, US.
Intermountain Healthcare, Salt Lake City, Utah, US.
AMIA Annu Symp Proc. 2020 Mar 4;2019:794-803. eCollection 2019.
Surgical site infection (SSI) surveillance in healthcare systems is labor-intensive and plagued by underreporting because current methodology relies heavily on manual chart review. The rapid adoption of electronic health records (EHRs) opens the possibility of secondary use of EHR data for quality surveillance programs. This study investigates the effectiveness of integrating natural language processing (NLP) outputs with structured EHR data to build machine learning models for SSI identification using real-world clinical data. We examined a set of models using structured data with and without NLP document-level, mention-level, and keyword features. The top-performing model was a Random Forest classifier enhanced with NLP document-level features, achieving a sensitivity of 0.58, specificity of 0.97, positive predictive value (PPV) of 0.54, negative predictive value (NPV) of 0.98, and F score of 0.52. We further examined the feature contributions, analyzed the errors, and discussed future directions.
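For readers less familiar with the five metrics reported above, the following minimal sketch shows how each is conventionally derived from a binary confusion matrix. It is not the study's code: the function name `screening_metrics` and the example counts are hypothetical and chosen only for illustration, and the F score is assumed here to be the balanced F1 (the abstract does not say which F variant or averaging scheme was used).

```python
def _safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion counts.

    tp: cases with the condition flagged by the model (true positives)
    fp: cases without the condition flagged by the model (false positives)
    tn: cases without the condition not flagged (true negatives)
    fn: cases with the condition missed by the model (false negatives)
    """
    sensitivity = _safe_div(tp, tp + fn)  # recall: share of true cases caught
    specificity = _safe_div(tn, tn + fp)  # share of non-cases correctly cleared
    ppv = _safe_div(tp, tp + fp)          # precision: share of flags that are real
    npv = _safe_div(tn, tn + fn)          # share of non-flags that are truly negative
    # Assumption: balanced F1, the harmonic mean of PPV and sensitivity.
    f1 = _safe_div(2 * ppv * sensitivity, ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts for a rare outcome; NOT data from the study.
    metrics = screening_metrics(tp=30, fp=25, tn=925, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

With a rare outcome, specificity and NPV tend to be high simply because negatives dominate, so sensitivity and PPV are usually the more informative figures when judging a surveillance model of this kind.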