Predicting relations between SOAP note sections: The value of incorporating a clinical information model.

Author information

Section for Biomedical Informatics and Data Science, Yale University School of Medicine, 300 George St, 06511, New Haven, USA; Department of Emergency Medicine, Yale University School of Medicine, 464 Congress Ave #260, New Haven, 06519, USA; Program of Computational Biology and Bioinformatics, Yale University, 300 George St, New Haven, 06511, USA.

Department of Emergency Medicine, Yale University School of Medicine, 464 Congress Ave #260, New Haven, 06519, USA.

Publication information

J Biomed Inform. 2023 May;141:104360. doi: 10.1016/j.jbi.2023.104360. Epub 2023 Apr 14.

DOI:10.1016/j.jbi.2023.104360

PMID:37061014

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10197152/

Abstract

Physician progress notes are frequently organized into Subjective, Objective, Assessment, and Plan (SOAP) sections. The Assessment section synthesizes information recorded in the Subjective and Objective sections, and the Plan section documents tests and treatments to narrow the differential diagnosis and manage symptoms. Classifying the relationship between the Assessment and Plan sections has been suggested to provide valuable insight into clinical reasoning. In this work, we use a novel human-in-the-loop pipeline to classify the relationships between the Assessment and Plan sections of SOAP notes as a part of the n2c2 2022 Track 3 Challenge. In particular, we use a clinical information model constructed from both the entailment logic expected from the aforementioned Challenge and the problem-oriented medical record. This information model is used to label named entities as primary and secondary problems/symptoms, events and complications in all four SOAP sections. We iteratively train separate Named Entity Recognition models and use them to annotate entities in all notes/sections. We fine-tune a downstream RoBERTa-large model to classify the Assessment-Plan relationship. We evaluate multiple language model architectures, preprocessing parameters, and methods of knowledge integration, achieving a maximum macro-F1 score of 82.31%. Our initial model achieves top-2 performance during the challenge (macro-F1: 81.52%, competitors' macro-F1 range: 74.54%-82.12%). We improved our model by incorporating post-challenge annotations (S&O sections), outperforming the top model from the Challenge. We also used Shapley additive explanations to investigate the extent of language model clinical logic, under the lens of our clinical information model. We find that the model often uses shallow heuristics and nonspecific attention when making predictions, suggesting language model knowledge integration requires further research.
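The scores reported above are macro-F1 values, i.e. the unweighted mean of the per-class F1 scores, so rare relation classes count as much as common ones. The following is a minimal sketch of that metric in plain Python, not the Challenge's official scorer. The function name `macro_f1`, the lowercase label strings, and the toy gold/predicted lists are illustrative assumptions and do not come from the paper.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # predicted class p, but it was wrong
            fn[g] += 1          # true class g was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy Assessment-Plan relation labels (hypothetical data, for illustration only)
gold = ["direct", "direct", "indirect", "neither", "not_relevant", "indirect"]
pred = ["direct", "indirect", "indirect", "neither", "not_relevant", "neither"]
print(round(macro_f1(gold, pred), 4))  # 0.7083
```

Because every class contributes equally to the average, a model that ignores an infrequent relation type is penalized even when its overall accuracy is high.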

