Application of natural language processing to identify social needs from patient medical notes: development and assessment of a scalable, performant, and rule-based model in an integrated healthcare delivery system.

Author information

Gray Geoffrey M, Zirikly Ayah, Ahumada Luis M, Rouhizadeh Masoud, Richards Thomas, Kitchen Christopher, Foroughmand Iman, Hatef Elham

Affiliations

Center for Pediatric Data Science and Analytic Methodology, Johns Hopkins All Children's Hospital, St. Petersburg, FL, United States.

Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, United States.

Publication information

JAMIA Open. 2023 Oct 4;6(4):ooad085. doi: 10.1093/jamiaopen/ooad085. eCollection 2023 Dec.

DOI:10.1093/jamiaopen/ooad085

PMID:37799347

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10550267/

Abstract

OBJECTIVES

To develop and test a scalable, performant, and rule-based model for identifying 3 major domains of social needs (residential instability, food insecurity, and transportation issues) from the unstructured data in electronic health records (EHRs).

MATERIALS AND METHODS

We included patients aged 18 years or older who received care at the Johns Hopkins Health System (JHHS) between July 2016 and June 2021 and had at least 1 unstructured (free-text) note in their EHR during the study period. We used a combination of manual lexicon curation and semiautomated lexicon creation for feature development. We developed an initial rule-based pipeline (Match Pipeline) using 2 keyword sets for each social needs domain. We performed rule-based keyword matching for distinct lexicons and tested the algorithm using an annotated dataset comprising 192 patients. Starting with a set of expert-identified keywords, we tested the adjustments by evaluating false positives and false negatives identified in the labeled dataset. We assessed the performance of the algorithm using precision, recall, and F1 score.
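The keyword-matching step described above can be illustrated with a minimal Python sketch. It is not the study's Match Pipeline: the lexicon entries, the function names (`compile_lexicon`, `match_note`, `flag_patient`), and the rule that a patient is flagged when any note matches are all assumptions made for illustration.

```python
import re

# Hypothetical lexicons for illustration only; these are NOT the study's curated keyword sets.
LEXICONS = {
    "residential_instability": ["homeless", "eviction", "living in a shelter"],
    "food_insecurity": ["food insecurity", "food pantry", "skipping meals"],
    "transportation_issues": ["no transportation", "missed the bus", "lacks a ride"],
}


def compile_lexicon(keywords):
    """Build one case-insensitive pattern matching any keyword on word boundaries."""
    # Longer phrases first so they win over shorter overlapping alternatives.
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(k) for k in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


PATTERNS = {domain: compile_lexicon(words) for domain, words in LEXICONS.items()}


def match_note(text):
    """Return the distinct matched keywords per domain for one free-text note."""
    hits = {}
    for domain, pattern in PATTERNS.items():
        found = sorted({m.group(0).lower() for m in pattern.finditer(text)})
        if found:
            hits[domain] = found
    return hits


def flag_patient(notes):
    """Assumed aggregation rule: flag a domain if any of the patient's notes matches it."""
    flagged = set()
    for note in notes:
        flagged.update(match_note(note))
    return flagged
```

For example, `match_note("Patient is homeless and uses a food pantry.")` reports one hit in each of two domains. A sketch like this ignores negation ("denies homelessness") and context, which is one reason reviewing false positives and false negatives against a labeled dataset matters when refining a lexicon.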

RESULTS

The algorithm for identifying residential instability had the best overall performance, with weighted averages for precision, recall, and F1 score of 0.92, 0.84, and 0.92 for identifying patients with homelessness and 0.84, 0.82, and 0.79 for identifying patients with housing insecurity. Metrics for the food insecurity algorithm were high, whereas the transportation issues algorithm had the lowest overall performance.
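As a reading aid for the weighted averages reported above, the following sketch computes support-weighted precision, recall, and F1 from true and predicted labels. Weighting each class by its number of true instances is an assumption here (it is the common meaning of "weighted average"); the abstract does not specify the exact procedure, and the labels in the usage note are invented toy data, not study results.

```python
from collections import Counter


def per_class_prf(y_true, y_pred, label):
    """Precision, recall, and F1 for one class treated as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def weighted_prf(y_true, y_pred):
    """Average the per-class metrics, weighting each class by its support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    totals = [0.0, 0.0, 0.0]
    for label, n in support.items():
        for i, value in enumerate(per_class_prf(y_true, y_pred, label)):
            totals[i] += value * n / total
    return tuple(totals)
```

With toy labels `y_true = [1, 1, 1, 0]` and `y_pred = [1, 1, 0, 0]`, `weighted_prf` returns approximately `(0.875, 0.75, 0.767)`. Because F1 is computed per class before averaging, the weighted F1 is not simply the harmonic mean of the weighted precision and weighted recall.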

DISCUSSION

The NLP algorithm for identifying social needs at JHHS performed relatively well and offers an opportunity for implementation in a healthcare system.

CONCLUSION

The NLP approach developed in this project could be adapted and potentially operationalized in the routine data processes of a healthcare system.
