
Large language models to identify social determinants of health in electronic health records.

Author Information

Guevara Marco, Chen Shan, Thomas Spencer, Chaunzwa Tafadzwa L, Franco Idalid, Kann Benjamin H, Moningi Shalini, Qian Jack M, Goldstein Madeleine, Harper Susan, Aerts Hugo J W L, Catalano Paul J, Savova Guergana K, Mak Raymond H, Bitterman Danielle S

Affiliations

Artificial Intelligence in Medicine (AIM) Program, Mass General Brigham, Harvard Medical School, Boston, MA, USA.

Department of Radiation Oncology, Brigham and Women's Hospital/Dana-Farber Cancer Institute, Boston, MA, USA.

Publication Information

NPJ Digit Med. 2024 Jan 11;7(1):6. doi: 10.1038/s41746-023-00970-0.

Abstract

Social determinants of health (SDoH) play a critical role in patient outcomes, yet their documentation is often missing or incomplete in the structured data of electronic health records (EHRs). Large language models (LLMs) could enable high-throughput extraction of SDoH from the EHR to support research and clinical care. However, class imbalance and data limitations present challenges for this sparsely documented yet critical information. Here, we investigated the optimal methods for using LLMs to extract six SDoH categories from narrative text in the EHR: employment, housing, transportation, parental status, relationship, and social support. The best-performing models were fine-tuned Flan-T5 XL for any SDoH mentions (macro-F1 0.71) and Flan-T5 XXL for adverse SDoH mentions (macro-F1 0.70). The benefit of adding LLM-generated synthetic data to training varied across model architectures and sizes, but it improved the performance of smaller Flan-T5 models (delta F1 +0.12 to +0.23). Our best fine-tuned models outperformed ChatGPT-family models in zero- and few-shot settings, except GPT-4 with 10-shot prompting for adverse SDoH. Fine-tuned models were less likely than ChatGPT to change their predictions when race/ethnicity and gender descriptors were added to the text, suggesting less algorithmic bias (p < 0.05). Our models identified 93.8% of patients with adverse SDoH, while ICD-10 codes captured 2.0%. These results demonstrate the potential of LLMs to improve real-world evidence on SDoH and to assist in identifying patients who could benefit from resource support.
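
The abstract describes the modeling setup only at a high level. As an illustration of how sentence-level SDoH extraction with Flan-T5 and macro-F1 scoring could be framed, here is a minimal sketch assuming the HuggingFace transformers and datasets libraries plus scikit-learn; the prompt wording, label strings, hyperparameters, and the toy annotated_sentences examples are illustrative assumptions, not the authors' released code or data.

```python
# Minimal sketch: framing SDoH mention extraction as a text-to-text task with Flan-T5.
# Assumptions (not from the paper's released code): prompt wording, hyperparameters,
# and the toy annotations below are illustrative only.
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)
from datasets import Dataset
from sklearn.metrics import f1_score

# Six sentence-level SDoH categories named in the abstract.
LABELS = ["employment", "housing", "transportation",
          "parental status", "relationship", "social support"]

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-xl")

# Toy annotations for illustration; the study used expert-labeled EHR sentences.
annotated_sentences = [
    {"text": "Patient lives alone and was recently laid off from his job.",
     "labels": ["employment", "social support"]},
    {"text": "She attends today's visit with her husband.",
     "labels": ["relationship"]},
]

def preprocess(example):
    # Input: instruction plus the EHR sentence; target: comma-separated labels (or "none").
    prompt = ("List the social determinants of health mentioned in the sentence, "
              "choosing from: " + ", ".join(LABELS) + ". Sentence: " + example["text"])
    target = ", ".join(example["labels"]) if example["labels"] else "none"
    enc = tokenizer(prompt, truncation=True, max_length=512)
    enc["labels"] = tokenizer(text_target=target, truncation=True, max_length=32)["input_ids"]
    return enc

train_ds = Dataset.from_list(annotated_sentences).map(
    preprocess, remove_columns=["text", "labels"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="sdoh-flan-t5", num_train_epochs=3,
                                  per_device_train_batch_size=8, learning_rate=3e-4),
    train_dataset=train_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()

def macro_f1(true_label_sets, pred_label_sets):
    # Multi-label macro-F1 over the six categories, the headline metric in the abstract.
    to_binary = lambda sets: [[int(lab in s) for lab in LABELS] for s in sets]
    return f1_score(to_binary(true_label_sets), to_binary(pred_label_sets), average="macro")
```

The sketch only shows the general task framing and scoring; in the study, fine-tuned Flan-T5 XL/XXL models evaluated this way reached macro-F1 of 0.71 for any SDoH mentions and 0.70 for adverse SDoH mentions.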

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/120e/10781957/cfe9fe114e61/41746_2023_970_Fig1_HTML.jpg
