Evaluation of a Natural Language Processing Approach to Identify Social Determinants of Health in Electronic Health Records in a Diverse Community Cohort.

Affiliations

University of Illinois at Urbana-Champaign Carle Illinois College of Medicine, Champaign, IL.

Mid-Atlantic Permanente Medical Group PC, Kaiser Permanente Mid-Atlantic States, Rockville, MD.

Publication information

Med Care. 2022 Mar 1;60(3):248-255. doi: 10.1097/MLR.0000000000001683.

DOI:10.1097/MLR.0000000000001683

PMID:34984989

Abstract

BACKGROUND

Health care systems in the United States are increasingly interested in measuring and addressing social determinants of health (SDoH). Advances in electronic health record systems and Natural Language Processing (NLP) create a unique opportunity to systematically document patient SDoH from digitized free-text provider notes.

METHODS

Patient SDoH status [recorded by the Your Current Life Situation (YCLS) Survey] and associated provider notes recorded between March 2017 and June 2020 were extracted (32,261 beneficiaries; 50,722 YCLS surveys; 485,425 provider notes). NLP patterns were generated using a machine learning test statistic (Term Frequency-Inverse Document Frequency). Patterns were developed and assessed in training, training validation, and final validation datasets (64%, 16%, and 20% of total data, respectively). NLP models analyzed SDoH-specific categories (housing, medical care, and transportation needs) and a combined SDoH metric. Model performance was assessed using sensitivity, specificity, and the Cohen κ statistic, assuming the YCLS Survey to be the gold standard.
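As a rough illustration of the two mechanical steps named above, the following sketch splits a record set 64%/16%/20% and scores terms with the textbook TF-IDF formula, tf × log(N / df). The abstract does not say how the study turned TF-IDF scores into NLP patterns, so the function names, the whitespace tokenizer, and the seeded random shuffle are all assumptions made for this example, not the authors' pipeline.

```python
import math
import random
from collections import Counter


def split_64_16_20(records, seed=0):
    """Shuffle records and split them into training (64%),
    training validation (16%), and final validation (remaining 20%)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 64 // 100  # integer arithmetic avoids float rounding
    n_val = n * 16 // 100
    train = shuffled[:n_train]
    train_val = shuffled[n_train:n_train + n_val]
    final_val = shuffled[n_train + n_val:]
    return train, train_val, final_val


def tfidf(docs):
    """Return one {term: score} dict per document,
    where score = term frequency * log(N / document frequency)."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return scores
```

A term that appears in every note scores zero under this formula, while a term concentrated in a few notes scores higher, which is why TF-IDF can surface candidate phrases for a category such as housing or transportation.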

RESULTS

Within the training validation dataset, NLP models showed strong sensitivity and specificity, with moderate agreement with the YCLS Survey (Housing: sensitivity=0.67, specificity=0.89, κ=0.51; Medical care: sensitivity=0.55, specificity=0.73, κ=0.20; Transportation: sensitivity=0.79, specificity=0.87, κ=0.58). Model performance in the training and training validation datasets was comparable. In the final validation dataset, a combined SDoH prediction metric showed sensitivity=0.77, specificity=0.69, κ=0.45.
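For readers less familiar with these measures, the sketch below shows how sensitivity, specificity, and Cohen κ are conventionally derived from a 2×2 table comparing a model's prediction with a reference standard (here, the survey response). The abstract does not report the underlying 2×2 counts, so the function name and any counts passed to it are hypothetical and are not study data.

```python
def agreement_metrics(tp, fn, fp, tn):
    """Compute sensitivity, specificity, and Cohen's kappa from 2x2 counts.

    tp: model positive, reference positive
    fn: model negative, reference positive
    fp: model positive, reference negative
    tn: model negative, reference negative
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Observed agreement between model and reference.
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return sensitivity, specificity, kappa


# Hypothetical counts for illustration only.
sens, spec, kappa = agreement_metrics(tp=40, fn=10, fp=20, tn=130)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, kappa={kappa:.2f}")
```

Because κ corrects for chance agreement, it depends on how common the need is in the sample as well as on sensitivity and specificity, so two categories with similar sensitivity and specificity can still differ in κ.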

CONCLUSION

This NLP algorithm demonstrated moderate performance in identifying unmet patient social needs. This novel approach may enable improved targeting of interventions, better allocation of limited resources, and monitoring of how well a health care system addresses its patients' SDoH needs.
