Reconstructing Sepsis Trajectories from Clinical Case Reports using LLMs: the Textual Time Series Corpus for Sepsis.

Authors

Noroozizadeh Shahriar, Weiss Jeremy C

Publication information

ArXiv. 2025 Apr 12:arXiv:2504.12326v1.

PMID: 40735076
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12306827/
Abstract

Clinical case reports and discharge summaries may be the most complete and accurate summaries of patient encounters, yet they are finalized, i.e., timestamped, only after the encounter. Complementary structured data streams become available sooner but suffer from incompleteness. To train models and algorithms on more complete and temporally fine-grained data, we construct a pipeline to phenotype, extract, and annotate time-localized findings within case reports using large language models (LLMs). We apply our pipeline to generate an open-access textual time series corpus for Sepsis-3 comprising 2,139 case reports from the PubMed Open Access (PMOA) Subset. To validate our system, we apply it to PMOA and to timeline annotations from I2B2/MIMIC-IV and compare the results to physician-expert annotations. We show high recovery rates of clinical findings (event match rate: O1-preview, 0.755; Llama 3.3 70B Instruct, 0.753) and strong temporal ordering (concordance: O1-preview, 0.932; Llama 3.3 70B Instruct, 0.932). Our work characterizes the ability of LLMs to time-localize clinical findings in text, illustrating the limitations of LLM use for temporal reconstruction and suggesting several potential avenues of improvement via multimodal integration.
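
The abstract reports two kinds of score, an event match rate and a temporal-ordering concordance, without defining them. The sketch below is one plausible reading, not the authors' implementation. It assumes each timeline is a list of (event text, time in hours) pairs, that events match on case-insensitive exact text, and that concordance is the fraction of matched event pairs whose relative order agrees with the reference. The function names and the toy timelines are hypothetical.

```python
from itertools import combinations


def match_events(reference, predicted):
    """Pair reference and predicted events by normalized text.

    Assumption: exact match after lowercasing and stripping whitespace.
    Returns a list of (reference_time, predicted_time) for matched events.
    """
    predicted_times = {text.strip().lower(): t for text, t in predicted}
    matched = []
    for text, ref_time in reference:
        key = text.strip().lower()
        if key in predicted_times:
            matched.append((ref_time, predicted_times[key]))
    return matched


def event_match_rate(reference, predicted):
    """Fraction of reference events recovered in the predicted timeline."""
    if not reference:
        return float("nan")
    return len(match_events(reference, predicted)) / len(reference)


def concordance(matched):
    """Fraction of comparable pairs whose predicted order agrees with the reference.

    Pairs tied in the reference are skipped; pairs tied only in the
    prediction count as half concordant (a common c-index convention).
    """
    concordant = 0.0
    comparable = 0
    for (r1, p1), (r2, p2) in combinations(matched, 2):
        if r1 == r2:
            continue  # no reference ordering to compare against
        comparable += 1
        if p1 == p2:
            concordant += 0.5
        elif (r1 - r2) * (p1 - p2) > 0:
            concordant += 1.0
    return concordant / comparable if comparable else float("nan")


# Toy timelines (hours relative to admission); purely illustrative data.
reference = [("fever", -48), ("hypotension", 0), ("intubation", 6), ("death", 72)]
predicted = [("Fever", -24), ("hypotension", 0), ("death", 48), ("rash", 0)]

rate = event_match_rate(reference, predicted)
order_score = concordance(match_events(reference, predicted))
print(f"event match rate = {rate:.3f}, concordance = {order_score:.3f}")
```

Under this reading, the match rate measures how many annotated findings the model recovers, and concordance measures only whether recovered findings are ordered correctly, so a model can score well on one and poorly on the other.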

