Extraction of Social Determinants of Health From Electronic Health Records Using Natural Language Processing.

Authors

Chen Zhenghua, Lasserre Patricia, Lin Angela, Rajapakshe Rasika

Affiliations

BC Cancer Kelowna, Kelowna, Canada.

Computer Science, University of British Columbia-Okanagan, Kelowna, Canada.

Publication information

JCO Clin Cancer Inform. 2025 Jul;9:e2400317. doi: 10.1200/CCI-24-00317. Epub 2025 Jul 23.

DOI:10.1200/CCI-24-00317
PMID:40700678
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12309507/
Abstract

PURPOSE

Social Determinants of Health (SDoH) have a significant effect on health outcomes and inequalities. SDoH can be extracted from electronic health records (EHR) to aid policy development and research to improve population health. Automated extraction using artificial intelligence (AI) can improve efficiency and cost-effectiveness. The focus of this study was to autonomously extract comprehensive SDoH details from EHR using a natural language processing (NLP)-based AI pipeline.

MATERIALS AND METHODS

A curated set of 1,000 BC Cancer clinical documents with concentrated SDoH information served as the reference standard for training and evaluating NLP models. Two pipelines were used: an open-source pipeline trained on the annotated medical documents and an industrial pretrained solution used as a benchmark. Three experiments optimized the first pipeline's performance, assessing the effect of including subtype word positions during training. The superior open-source pipeline was then used to extract SDoH information from 13,258 oncology documents.
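One way to picture "subtype word positions" is as token offsets attached to each annotation. The sketch below is illustrative only: the abstract does not describe the annotation schema, so the `SdohAnnotation` class, the `locate_subtype` helper, and the example sentence are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SdohAnnotation:
    """Hypothetical record: one SDoH factor plus the token span of its subtype."""
    factor: str   # e.g. "tobacco_use"
    subtype: str  # e.g. "former smoker"
    start: int    # index of the first subtype token
    end: int      # index one past the last subtype token


def locate_subtype(tokens: list[str], subtype_tokens: list[str]) -> Optional[tuple[int, int]]:
    """Return the (start, end) token span of the first case-insensitive match, or None."""
    n = len(subtype_tokens)
    if n == 0:
        return None
    lowered = [t.lower() for t in tokens]
    target = [t.lower() for t in subtype_tokens]
    for i in range(len(lowered) - n + 1):
        if lowered[i:i + n] == target:
            return (i, i + n)
    return None


# Made-up sentence, not taken from the study's documents.
tokens = "Patient is a former smoker and lives alone".split()
span = locate_subtype(tokens, ["former", "smoker"])
annotation = SdohAnnotation("tobacco_use", "former smoker", *span) if span else None
```

A model trained with such spans sees where the subtype phrase sits in the text; a model trained without them sees only the label.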

RESULTS

The open-source pipeline achieved an average F1 score of 0.88 on the validation data set for extracting 13 SDoH factors, surpassing the benchmark by 5%. It excelled in detailed subtype extraction, while the benchmark performed better in identifying rarely annotated SDoH information in the BC Cancer data set. Overall, 60,717 SDoH factors and associated details were extracted from BC Cancer EHR oncology documents. The most frequently extracted SDoH factors included tobacco use, employment status, marital status, alcohol consumption, and living status, each occurring between 8,000 and 12,000 times.
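For readers unfamiliar with the metric, an average F1 score over several factors can be computed as in the minimal sketch below. It assumes an unweighted (macro) average across factors, which the abstract does not specify, and every count in it is invented for illustration; none are results from the study.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # true positives: correctly extracted mentions
    fp: int  # false positives: extracted but not in the reference standard
    fn: int  # false negatives: in the reference standard but missed


def f1_score(c: Counts) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * c.tp + c.fp + c.fn
    return 0.0 if denom == 0 else 2 * c.tp / denom


def macro_f1(per_factor: dict[str, Counts]) -> float:
    """Unweighted mean of the per-factor F1 scores (assumed averaging scheme)."""
    scores = [f1_score(c) for c in per_factor.values()]
    return sum(scores) / len(scores)


# Hypothetical counts for two factors, chosen only to make the arithmetic easy.
example = {
    "tobacco_use": Counts(tp=90, fp=10, fn=10),        # F1 = 180 / 200 = 0.90
    "employment_status": Counts(tp=80, fp=20, fn=20),  # F1 = 160 / 200 = 0.80
}
average = macro_f1(example)  # (0.90 + 0.80) / 2 = 0.85
```

A macro average weights rare and common factors equally, so weak performance on rarely annotated factors lowers it more than it would lower a frequency-weighted average.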

CONCLUSION

This study demonstrates the potential of an NLP pipeline to extract SDoH factors from clinical notes, with strong performance on limited data, although data set-specific adjustments are needed for broader application across institutions.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5cc/12309507/71136179103e/cci-9-e2400317-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5cc/12309507/46d561802b13/cci-9-e2400317-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f5cc/12309507/98e0842f47e6/cci-9-e2400317-g002.jpg

Similar articles

1
Extraction of Social Determinants of Health From Electronic Health Records Using Natural Language Processing.
JCO Clin Cancer Inform. 2025 Jul;9:e2400317. doi: 10.1200/CCI-24-00317. Epub 2025 Jul 23.
2
Scalable information extraction from free text electronic health records using large language models.
BMC Med Res Methodol. 2025 Jan 28;25(1):23. doi: 10.1186/s12874-025-02470-z.
3
Natural Language Processing and Social Determinants of Health in Mental Health Research: AI-Assisted Scoping Review.
JMIR Ment Health. 2025 Jan 16;12:e67192. doi: 10.2196/67192.
4
The 2022 n2c2/UW shared task on extracting social determinants of health.
J Am Med Inform Assoc. 2023 Jul 19;30(8):1367-1378. doi: 10.1093/jamia/ocad012.
5
Cost-effectiveness of using prognostic information to select women with breast cancer for adjuvant systemic therapy.
Health Technol Assess. 2006 Sep;10(34):iii-iv, ix-xi, 1-204. doi: 10.3310/hta10340.
6
Are Detailed, Patient-level Social Determinant of Health Factors Associated With Physical Function and Mental Health at Presentation Among New Patients With Orthopaedic Conditions?
Clin Orthop Relat Res. 2023 May 1;481(5):912-921. doi: 10.1097/CORR.0000000000002446. Epub 2022 Oct 6.
7
Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.
8
Machine Learning and Natural Language Processing in Mental Health: Systematic Review.
J Med Internet Res. 2021 May 4;23(5):e15708. doi: 10.2196/15708.
9
Sexual Harassment and Prevention Training
10
Extraction of sleep information from clinical notes of Alzheimer's disease patients using natural language processing.
J Am Med Inform Assoc. 2024 Oct 1;31(10):2217-2227. doi: 10.1093/jamia/ocae177.
