Natural language processing to identify social determinants of health in Alzheimer's disease and related dementia from electronic health records.

Affiliations

Departments of Population Health and Medicine, Grossman School of Medicine, New York University, New York City, New York, USA.

Center for Data Science, New York University, New York City, New York, USA.

Publication information

Health Serv Res. 2023 Dec;58(6):1292-1302. doi: 10.1111/1475-6773.14210. Epub 2023 Aug 3.

DOI:10.1111/1475-6773.14210

PMID:37534741

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC10622277/

Abstract

OBJECTIVE

To develop a natural language processing (NLP) algorithm that identifies social determinants of health (SDoH), including housing, transportation, food, and medication insecurities, social isolation, abuse, neglect, or exploitation, and financial difficulties for patients with Alzheimer's disease and related dementias (ADRD) from unstructured electronic health records (EHRs).

DATA SOURCES AND STUDY SETTING

We leveraged 1000 medical notes randomly selected from 7401 emergency department and inpatient social worker notes generated between 2015 and 2019 for 231 unique patients diagnosed with ADRD at Michigan Medicine.
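The sampling step can be illustrated with a minimal Python sketch. It assumes the note identifiers are simply the integers 0 to 7400 and that both the draw of 1000 notes and the 700/300 split reported under STUDY DESIGN are simple random selections with a fixed seed. The abstract does not state the seed, the identifiers, or how the split was made, so `sample_and_split` and its parameters are hypothetical.

```python
import random

def sample_and_split(note_ids, n_sample=1000, n_train=700, seed=0):
    """Draw n_sample notes without replacement, then split them into two sets."""
    rng = random.Random(seed)                 # fixed seed (assumed) for reproducibility
    sampled = rng.sample(note_ids, n_sample)  # sampled list is already in random order
    return sampled[:n_train], sampled[n_train:]

all_note_ids = list(range(7401))              # hypothetical IDs for the 7401 notes
train_ids, valid_ids = sample_and_split(all_note_ids)
print(len(train_ids), len(valid_ids))
```

Because `random.Random.sample` returns the selected items in random order, slicing the result gives a random split without a second shuffle.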

STUDY DESIGN

We developed a rule-based NLP algorithm for the identification of seven domains of SDoH noted above. We also compared the rule-based algorithm with deep learning and regularized logistic regression approaches. These models were compared using accuracy, sensitivity, specificity, F1 score, and the area under the receiver operating characteristic curve (AUC). All notes were split into 700 notes for training NLP algorithms, and 300 notes for validation.
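As a rough illustration of what a rule-based approach to the seven domains could look like, the Python sketch below flags a note for each domain whose keyword patterns match. The patterns in `SDOH_RULES` and the function `flag_sdoh` are hypothetical examples written for this sketch; the abstract does not list the study's actual rules, and a production system would likely also need negation handling (for example, "denies abuse"), which this sketch omits.

```python
import re

# Hypothetical keyword patterns per domain; NOT the rules used in the study.
SDOH_RULES = {
    "housing": [r"\bhomeless\b", r"\beviction\b", r"\bshelter\b"],
    "transportation": [r"\bno transportation\b", r"\blacks? a ride\b"],
    "food": [r"\bfood (?:insecurity|pantry|stamps)\b"],
    "medication": [r"\bcannot afford (?:his|her|their) medications?\b"],
    "social_isolation": [r"\blives alone\b", r"\bsocially isolated\b"],
    "abuse_neglect_exploitation": [r"\babuse\b", r"\bneglect\b", r"\bexploitation\b"],
    "financial": [r"\bfinancial (?:difficulty|difficulties|strain)\b"],
}

# Compile once so each note is scanned with case-insensitive patterns.
COMPILED = {
    domain: [re.compile(p, re.IGNORECASE) for p in patterns]
    for domain, patterns in SDOH_RULES.items()
}

def flag_sdoh(note: str) -> dict:
    """Return {domain: bool}, True when any pattern for that domain matches."""
    return {
        domain: any(p.search(note) for p in patterns)
        for domain, patterns in COMPILED.items()
    }

note = "Patient lives alone and reports no transportation to appointments."
flags = flag_sdoh(note)
print([domain for domain, hit in flags.items() if hit])
```

Each domain is treated as an independent binary label, which matches the per-category reporting of F1 and AUC in the abstract.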

DATA COLLECTION/EXTRACTION METHODS

Social worker notes used in this study were extracted from the Michigan Medicine EHR database.

PRINCIPAL FINDINGS

Of the 700 notes for training, F1 and AUC for the rule-based algorithm were at least 0.94 and 0.95, respectively, for all SDoH categories. Of the 300 notes for validation, F1 and AUC were at least 0.80 and 0.97, respectively, for all SDoH except housing and medication insecurities. The deep learning and regularized logistic regression algorithms had unsatisfactory performance.
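The reported metrics can be computed per SDoH category from binary gold labels and binary predictions, as in the sketch below. The function name `binary_metrics` and the toy labels are illustrative assumptions, not data from the study. For hard 0/1 predictions the ROC curve has a single operating point, so the trapezoidal AUC reduces to the mean of sensitivity and specificity; the abstract does not say how AUC was computed, so that choice is also an assumption.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, F1, and AUC for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # Single-threshold ROC: area under (0,0) -> (1-spec, sens) -> (1,1).
    auc = (sensitivity + specificity) / 2

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "auc": auc,
    }

# Toy labels for one category (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Running the function once per category and taking the minimum of each metric gives "at least" figures of the kind quoted in the findings.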

CONCLUSIONS

The rule-based algorithm can accurately extract SDoH information in all seven domains of SDoH except housing and medication insecurities. Findings from the algorithm can be used by clinicians and social workers to proactively address social needs of patients with ADRD and other vulnerable patient populations.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7c1/10622277/3d4925fe4b8b/HESR-58-1292-g002.jpg

Similar articles

Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing.

J Biomed Inform. 2022 Mar;127:103984. doi: 10.1016/j.jbi.2021.103984. Epub 2022 Jan 7.

Evaluation of a Natural Language Processing Approach to Identify Social Determinants of Health in Electronic Health Records in a Diverse Community Cohort.

Med Care. 2022 Mar 1;60(3):248-255. doi: 10.1097/MLR.0000000000001683.

Extracting Critical Information from Unstructured Clinicians' Notes Data to Identify Dementia Severity Using a Rule-Based Approach: Feasibility Study.

JMIR Aging. 2024 Sep 24;7:e57926. doi: 10.2196/57926.

Social Determinants of Health in EMS Records: A Mixed-methods Analysis Using Natural Language Processing and Qualitative Content Analysis.

West J Emerg Med. 2023 Sep;24(5):878-887. doi: 10.5811/westjem.59070.

Extracting social determinants of health from electronic health records using natural language processing: a systematic review.

J Am Med Inform Assoc. 2021 Nov 25;28(12):2716-2727. doi: 10.1093/jamia/ocab170.

Extraction of sleep information from clinical notes of Alzheimer's disease patients using natural language processing.

J Am Med Inform Assoc. 2024 Oct 1;31(10):2217-2227. doi: 10.1093/jamia/ocae177.

Automated phenotyping of mild cognitive impairment and Alzheimer's disease and related dementias using electronic health records.

Int J Med Inform. 2025 Aug;200:105917. doi: 10.1016/j.ijmedinf.2025.105917. Epub 2025 Apr 11.

Leveraging natural language processing to augment structured social determinants of health data in the electronic health record.

J Am Med Inform Assoc. 2023 Jul 19;30(8):1389-1397. doi: 10.1093/jamia/ocad073.

Scalable information extraction from free text electronic health records using large language models.

BMC Med Res Methodol. 2025 Jan 28;25(1):23. doi: 10.1186/s12874-025-02470-z.

Cited by

Identifying Transportation Needs in Ophthalmology Clinic Notes Using Natural Language Processing: Retrospective, Cross-Sectional Study.

JMIR Med Inform. 2025 Sep 5;13:e69216. doi: 10.2196/69216.

Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs.

AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:491-500. eCollection 2025.

Development of a natural language processing algorithm to extract social determinants of health from clinician notes.

Am J Transplant. 2025 Jun;25(6):1306-1318. doi: 10.1016/j.ajt.2025.02.019. Epub 2025 Mar 6.

SBDH-Reader: an LLM-powered method for extracting social and behavioral determinants of health from medical notes.

medRxiv. 2025 Feb 21:2025.02.19.25322576. doi: 10.1101/2025.02.19.25322576.

Natural language processing in Alzheimer's disease research: Systematic review of methods, data, and efficacy.

Alzheimers Dement (Amst). 2025 Feb 11;17(1):e70082. doi: 10.1002/dad2.70082. eCollection 2025 Jan-Mar.

Extracting Housing and Food Insecurity Information From Clinical Notes Using cTAKES.

Health Serv Res. 2025 May;60 Suppl 3(Suppl 3):e14440. doi: 10.1111/1475-6773.14440. Epub 2025 Jan 28.

Applications of Natural Language Processing and Large Language Models for Social Determinants of Health: Protocol for a Systematic Review.

JMIR Res Protoc. 2025 Jan 21;14:e66094. doi: 10.2196/66094.

Using large language models for extracting stressful life events to assess their impact on preventive colon cancer screening adherence.

BMC Public Health. 2025 Jan 2;25(1):12. doi: 10.1186/s12889-024-21123-2.

Extracting Critical Information from Unstructured Clinicians' Notes Data to Identify Dementia Severity Using a Rule-Based Approach: Feasibility Study.

JMIR Aging. 2024 Sep 24;7:e57926. doi: 10.2196/57926.

On the development and validation of large language model-based classifiers for identifying social determinants of health.

Proc Natl Acad Sci U S A. 2024 Sep 24;121(39):e2320716121. doi: 10.1073/pnas.2320716121. Epub 2024 Sep 16.
