
Zero-shot interpretable phenotyping of postpartum hemorrhage using large language models.

Authors

Alsentzer Emily, Rasmussen Matthew J, Fontoura Romy, Cull Alexis L, Beaulieu-Jones Brett, Gray Kathryn J, Bates David W, Kovacheva Vesela P

Affiliations

Division of General Internal Medicine and Primary Care, Brigham and Women's Hospital, Boston, MA, USA.

Department of Anesthesiology, Perioperative and Pain Medicine, Brigham and Women's Hospital, Boston, MA, USA.

Publication

NPJ Digit Med. 2023 Nov 30;6(1):212. doi: 10.1038/s41746-023-00957-x.

DOI:10.1038/s41746-023-00957-x
PMID:38036723
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10689487/
Abstract

Many areas of medicine would benefit from deeper, more accurate phenotyping, but there are limited approaches for phenotyping using clinical notes without substantial annotated data. Large language models (LLMs) have demonstrated immense potential to adapt to novel tasks with no additional training by specifying task-specific instructions. Here we report the performance of a publicly available LLM, Flan-T5, in phenotyping patients with postpartum hemorrhage (PPH) using discharge notes from electronic health records (n = 271,081). The language model achieves strong performance in extracting 24 granular concepts associated with PPH. Identifying these granular concepts accurately allows the development of interpretable, complex phenotypes and subtypes. The Flan-T5 model achieves high fidelity in phenotyping PPH (positive predictive value of 0.95), identifying 47% more patients with this complication compared to the current standard of using claims codes. This LLM pipeline can be used reliably for subtyping PPH and outperforms a claims-based approach on the three most common PPH subtypes associated with uterine atony, abnormal placentation, and obstetric trauma. The advantage of this approach to subtyping is its interpretability, as each concept contributing to the subtype determination can be evaluated. Moreover, as definitions may change over time due to new guidelines, using granular concepts to create complex phenotypes enables prompt and efficient updating of the algorithm. Using this language modelling approach enables rapid phenotyping without the need for any manually annotated training data across multiple clinical use cases.
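The abstract describes extracting granular concepts from notes with zero-shot prompts and then combining them into interpretable phenotypes and subtypes. The following is a minimal sketch of that general idea, not the authors' pipeline: the concept names, prompt wording, and subtype rules are hypothetical, and `keyword_stub` is a toy stand-in for a call to an instruction-tuned model such as Flan-T5.

```python
from typing import Callable, Dict, List

# Hypothetical yes/no prompts, one per granular concept (not the paper's prompts).
CONCEPT_PROMPTS: Dict[str, str] = {
    "pph_mentioned": "Does the note mention postpartum hemorrhage? Answer yes or no.",
    "uterine_atony": "Does the note mention uterine atony? Answer yes or no.",
    "abnormal_placentation": "Does the note mention placenta accreta? Answer yes or no.",
    "obstetric_trauma": "Does the note mention laceration? Answer yes or no.",
}

# Hypothetical subtype rules: each subtype lists the concepts it requires.
SUBTYPE_RULES: Dict[str, List[str]] = {
    "atony": ["uterine_atony"],
    "placentation": ["abnormal_placentation"],
    "trauma": ["obstetric_trauma"],
}


def extract_concepts(note: str, answer_fn: Callable[[str], str]) -> Dict[str, bool]:
    """Ask one zero-shot question per concept and record a boolean answer."""
    results: Dict[str, bool] = {}
    for concept, question in CONCEPT_PROMPTS.items():
        prompt = f"{question}\n\nNote: {note}"
        answer = answer_fn(prompt).strip().lower()
        results[concept] = answer.startswith("yes")
    return results


def phenotype(concepts: Dict[str, bool]) -> Dict[str, object]:
    """Combine concept flags into a phenotype, keeping the contributing evidence."""
    has_pph = concepts["pph_mentioned"]
    subtypes = [
        name
        for name, required in SUBTYPE_RULES.items()
        if has_pph and all(concepts[c] for c in required)
    ]
    evidence = sorted(c for c, present in concepts.items() if present)
    return {"pph": has_pph, "subtypes": subtypes, "evidence": evidence}


def keyword_stub(prompt: str) -> str:
    """Toy stand-in for a model call: plain keyword match, no negation handling."""
    question, note = prompt.split("\n\nNote: ", 1)
    phrase = question.split("mention ", 1)[1].split("?", 1)[0]
    return "yes" if phrase.lower() in note.lower() else "no"


if __name__ == "__main__":
    example = "Delivery complicated by postpartum hemorrhage due to uterine atony."
    print(phenotype(extract_concepts(example, keyword_stub)))
```

Because each subtype is a rule over named concepts, the `evidence` list shows which concepts drove a decision, and a revised definition would only require editing `SUBTYPE_RULES`. In practice `answer_fn` would wrap a real model call, which is left out here so the sketch runs without downloading weights.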


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b55d/10689487/98cc322c94aa/41746_2023_957_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b55d/10689487/ca9e5114fd3d/41746_2023_957_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b55d/10689487/61d85ce2fd6c/41746_2023_957_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b55d/10689487/51c4f8070225/41746_2023_957_Fig4_HTML.jpg
