Summarizing Patients' Problems from Hospital Progress Notes Using Pre-trained Sequence-to-Sequence Models

Authors

Gao Yanjun, Miller Timothy, Xu Dongfang, Dligach Dmitriy, Churpek Matthew M, Afshar Majid

Affiliations

ICU Data Science Lab, School of Medicine and Public Health, University of Wisconsin-Madison.

Boston Children's Hospital, and Harvard Medical School.

Publication information

Proc Int Conf Comput Ling. 2022 Oct;2022:2979-2991.

PMID:36268128
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9581107/
Abstract

Automatically summarizing patients' main problems from daily progress notes using natural language processing methods helps to battle against information and cognitive overload in hospital settings and potentially assists providers with computerized diagnostic decision support. Problem list summarization requires a model to understand, abstract, and generate clinical documentation. In this work, we propose a new NLP task that aims to generate a list of problems in a patient's daily care plan using input from the provider's progress notes during hospitalization. We investigate the performance of T5 and BART, two state-of-the-art seq2seq transformer architectures, in solving this problem. We provide a corpus built on top of progress notes from publicly available electronic health record progress notes in the Medical Information Mart for Intensive Care (MIMIC)-III. T5 and BART are trained on general domain text, and we experiment with a data augmentation method and a domain adaptation pre-training method to increase exposure to medical vocabulary and knowledge. Evaluation methods include ROUGE, BERTScore, cosine similarity on sentence embedding, and F-score on medical concepts. Results show that T5 with domain adaptive pre-training achieves significant performance gains compared to a rule-based system and general domain pre-trained language models, indicating a promising direction for tackling the problem summarization task.
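
The abstract names ROUGE and an F-score on medical concepts among its evaluation methods. As a rough illustration of how overlap-based scores of this kind are computed, here is a minimal sketch in plain Python. It is not the paper's evaluation code: the function names, the whitespace tokenizer, and the toy problem lists are all assumptions made for illustration, and a real evaluation would rely on a proper ROUGE implementation and a clinical concept extractor.

```python
from __future__ import annotations

from collections import Counter


def concept_f1(predicted: set[str], reference: set[str]) -> float:
    """Set-level F1 between predicted and gold concept labels."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def unigram_f1(candidate: str, reference: str) -> float:
    """ROUGE-1-style F1 from clipped unigram overlap.

    Assumption: lowercased whitespace tokens; real ROUGE tooling
    applies its own tokenization and optional stemming.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped counts of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: plain strings stand in for normalized medical concepts.
gold = {"sepsis", "acute kidney injury", "hypotension"}
pred = {"sepsis", "hypotension", "pneumonia"}
concept_score = concept_f1(pred, gold)

text_score = unigram_f1(
    "sepsis with hypotension",
    "sepsis ; hypotension ; acute kidney injury",
)
```

In this toy case two of the three predicted concepts match the gold list, so precision and recall are both 2/3. The unigram score is lower because the reference string contains more tokens than the candidate covers.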
