Evaluation of a Natural Language Processing Approach to Identify Social Determinants of Health in Electronic Health Records in a Diverse Community Cohort.

Author information

University of Illinois at Urbana-Champaign Carle Illinois College of Medicine, Champaign, IL.

Mid-Atlantic Permanente Medical Group PC, Kaiser Permanente Mid-Atlantic States, Rockville, MD.

Publication information

Med Care. 2022 Mar 1;60(3):248-255. doi: 10.1097/MLR.0000000000001683.

Abstract

BACKGROUND

Health care systems in the United States are increasingly interested in measuring and addressing social determinants of health (SDoH). Advances in electronic health record systems and Natural Language Processing (NLP) create a unique opportunity to systematically document patient SDoH from digitized free-text provider notes.

METHODS

Patient SDoH status [recorded by the Your Current Life Situation (YCLS) Survey] and associated provider notes recorded between March 2017 and June 2020 were extracted (32,261 beneficiaries; 50,722 YCLS surveys; 485,425 provider notes). NLP patterns were generated using a machine learning test statistic (Term Frequency-Inverse Document Frequency). Patterns were developed and assessed in training, training validation, and final validation datasets (64%, 16%, and 20% of total data, respectively). NLP models analyzed SDoH-specific categories (housing, medical care, and transportation needs) and a combined SDoH metric. Model performance was assessed using sensitivity, specificity, and the Cohen κ statistic, assuming the YCLS Survey to be the gold standard.
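As a rough illustration of the Term Frequency-Inverse Document Frequency statistic named above, the following minimal sketch scores terms in a few toy notes. The note text, the whitespace tokenizer, and the specific weighting (term count divided by note length, times the natural log of N over document frequency) are assumptions for illustration only; the abstract does not state which TF-IDF variant or preprocessing the authors used.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    Assumed variant: tf = term count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical provider notes, invented for this example.
notes = [
    "patient reports eviction notice and unstable housing",
    "patient missed visit due to no transportation",
    "patient reports no concerns today",
]
scores = tf_idf(notes)

# The highest-scoring term in each note is a candidate pattern term.
top_terms = [max(s, key=s.get) for s in scores]
```

A term that appears in every note (here "patient") scores zero, while terms concentrated in a few notes score higher, which is why such a statistic can surface candidate phrases for a given SDoH category.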

RESULTS

Within the training validation dataset, NLP models showed strong sensitivity and specificity, with moderate agreement with the YCLS Survey (Housing: sensitivity=0.67, specificity=0.89, κ=0.51; Medical care: sensitivity=0.55, specificity=0.73, κ=0.20; Transportation: sensitivity=0.79, specificity=0.87, κ=0.58). Model performance in the training and training validation datasets was comparable. In the final validation dataset, a combined SDoH prediction metric showed sensitivity=0.77, specificity=0.69, κ=0.45.
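For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, and Cohen κ are computed from a 2×2 table that compares a model's prediction with a reference standard (here, the survey response). The counts are hypothetical and are not taken from the study; the function name is also an assumption for illustration.

```python
def agreement_metrics(tp, fn, fp, tn):
    """Compute sensitivity, specificity, and Cohen's kappa from 2x2 counts.

    tp/fn: reference-positive cases the model flagged / missed.
    fp/tn: reference-negative cases the model flagged / correctly cleared.
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Observed agreement between model and reference.
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return sensitivity, specificity, kappa


# Hypothetical counts, invented for this example (not study data).
sens, spec, kappa = agreement_metrics(tp=40, fn=10, fp=20, tn=130)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, kappa={kappa:.3f}")
```

Because κ corrects for chance agreement, it depends on how common the need is in the sample as well as on sensitivity and specificity, so two categories with similar sensitivity and specificity can still show different κ values.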

CONCLUSION

This NLP algorithm demonstrated moderate performance in identifying unmet patient social needs. This novel approach may enable improved targeting of interventions, allocation of limited resources, and monitoring of how well a health care system addresses its patients' SDoH needs.
