
Annotating social determinants of health using active learning, and characterizing determinants using neural event extraction.

Authors

Lybarger Kevin, Ostendorf Mari, Yetisgen Meliha

Affiliations

Biomedical & Health Informatics, University of Washington, Box 358047 Seattle, WA 98109, USA.

Department of Electrical & Computer Engineering, University of Washington, Campus Box 352500 185, Seattle, WA 98195-2500, USA.

Publication information

J Biomed Inform. 2021 Jan;113:103631. doi: 10.1016/j.jbi.2020.103631. Epub 2020 Dec 5.

DOI: 10.1016/j.jbi.2020.103631
PMID: 33290878
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7856628/
Abstract

Social determinants of health (SDOH) affect health outcomes, and knowledge of SDOH can inform clinical decision-making. Automatically extracting SDOH information from clinical text requires data-driven information extraction models trained on annotated corpora that are heterogeneous and frequently include critical SDOH. This work presents a new corpus with SDOH annotations, a novel active learning framework, and the first extraction results on the new corpus. The Social History Annotation Corpus (SHAC) includes 4480 social history sections with detailed annotation for 12 SDOH characterizing the status, extent, and temporal information of 18K distinct events. We introduce a novel active learning framework that selects samples for annotation using a surrogate text classification task as a proxy for a more complex event extraction task. The active learning framework successfully increases the frequency of health risk factors and improves automatic extraction of these events over undirected annotation. An event extraction model trained on SHAC achieves high extraction performance for substance use status (0.82-0.93 F1), employment status (0.81-0.86 F1), and living status type (0.81-0.93 F1) on data from three institutions.
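The abstract describes selecting samples for annotation with a surrogate text classification task standing in for the harder event extraction task. The sketch below illustrates that general idea only. It is not the authors' implementation: the naive Bayes surrogate, the rule of ranking by predicted risk-factor probability, the function names, and the toy notes are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()


def train_surrogate(labeled):
    """Fit a tiny multinomial naive Bayes surrogate classifier.

    `labeled` is a list of (text, label) pairs, where label 1 means the
    note mentions a health risk factor and 0 means it does not.
    """
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in labeled:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def predict_positive_prob(model, text):
    """Return the surrogate's probability that `text` has a risk factor."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in (0, 1):
        # Laplace-smoothed class prior and word likelihoods.
        score = math.log((doc_counts[label] + 1) / (total_docs + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp0 = math.exp(log_scores[0] - top)
    exp1 = math.exp(log_scores[1] - top)
    return exp1 / (exp0 + exp1)


def select_for_annotation(model, pool, k):
    """Pick the k unlabeled notes most likely to contain a risk factor."""
    ranked = sorted(pool, key=lambda t: predict_positive_prob(model, t),
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy, invented social history snippets (not taken from SHAC).
    labeled = [
        ("patient smokes one pack daily", 1),
        ("drinks alcohol heavily", 1),
        ("denies tobacco or alcohol", 0),
        ("lives with wife retired", 0),
    ]
    pool = [
        "smokes and drinks alcohol",
        "lives with wife",
        "retired teacher denies tobacco",
    ]
    model = train_surrogate(labeled)
    print(select_for_annotation(model, pool, 1))
    # prints ['smokes and drinks alcohol']
```

In a full active learning loop, the selected notes would be annotated, added to the labeled set, and the surrogate retrained before the next round. Ranking by predicted probability is one way to raise the share of risk-factor mentions in the annotated data; other criteria, such as uncertainty sampling, are equally plausible readings of the abstract.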

