Identifying signs and symptoms of urinary tract infection from emergency department clinical notes using large language models.

Affiliations

Department of Emergency Medicine, Yale School of Medicine, New Haven, Connecticut, USA.

Section for Biomedical Informatics and Data Science, Yale University School of Medicine, New Haven, Connecticut, USA.

Publication information

Acad Emerg Med. 2024 Jun;31(6):599-610. doi: 10.1111/acem.14883. Epub 2024 Apr 3.

DOI: 10.1111/acem.14883
PMID: 38567658

Abstract

BACKGROUND

Natural language processing (NLP) tools including recently developed large language models (LLMs) have myriad potential applications in medical care and research, including the efficient labeling and classification of unstructured text such as electronic health record (EHR) notes. This opens the door to large-scale projects that rely on variables that are not typically recorded in a structured form, such as patient signs and symptoms.

OBJECTIVES

This study is designed to acquaint the emergency medicine research community with the foundational elements of NLP, highlighting essential terminology, annotation methodologies, and the intricacies involved in training and evaluating NLP models. Symptom characterization is critical to urinary tract infection (UTI) diagnosis, but identification of symptoms from the EHR has historically been challenging, limiting large-scale research, public health surveillance, and EHR-based clinical decision support. We therefore developed and compared two NLP models to identify UTI symptoms from unstructured emergency department (ED) notes.

METHODS

The study population consisted of patients aged ≥ 18 who presented to an ED in a northeastern U.S. health system between June 2013 and August 2021 and had a urinalysis performed. We annotated a random subset of 1250 ED clinician notes from these visits for a list of 17 UTI symptoms. We then developed two task-specific LLMs to perform the task of named entity recognition: a convolutional neural network-based model (SpaCy) and a transformer-based model designed to process longer documents (Clinical Longformer). Models were trained on 1000 notes and tested on a holdout set of 250 notes. We compared model performance (precision, recall, F1 measure) at identifying the presence or absence of UTI symptoms at the note level.
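
The note-level comparison described above can be illustrated with a minimal sketch. This is not the authors' code: the `Entity` container, the helper names, the symptom labels, and the toy notes are all hypothetical. It assumes that a symptom counts as present in a note if the model (or the annotator) marked at least one entity with that label, and then computes precision, recall, and F1 per symptom.

```python
from collections import namedtuple

# Hypothetical container for one named-entity span: character offsets plus a symptom label.
Entity = namedtuple("Entity", ["start", "end", "label"])


def note_level_labels(entities):
    """Collapse the entity spans of one note into the set of symptom labels present."""
    return {e.label for e in entities}


def precision_recall_f1(gold_notes, pred_notes, label):
    """Note-level precision, recall, and F1 for a single symptom label."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_notes, pred_notes):
        in_gold = label in note_level_labels(gold)
        in_pred = label in note_level_labels(pred)
        if in_gold and in_pred:
            tp += 1
        elif in_pred:
            fp += 1
        elif in_gold:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data (invented for illustration): four notes, gold annotations vs. model output.
gold = [
    [Entity(0, 7, "dysuria"), Entity(20, 25, "fever")],
    [Entity(5, 12, "dysuria")],
    [],
    [Entity(3, 8, "fever")],
]
pred = [
    [Entity(0, 7, "dysuria")],
    [Entity(5, 12, "dysuria"), Entity(30, 35, "fever")],
    [Entity(1, 8, "dysuria")],
    [Entity(3, 8, "fever")],
]

for label in ("dysuria", "fever"):
    p, r, f1 = precision_recall_f1(gold, pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
# dysuria: precision=0.67 recall=1.00 f1=0.80
# fever: precision=0.50 recall=0.50 f1=0.50
```

Because scoring happens at the note level, exact span boundaries do not matter in this sketch; only whether each label appears at least once in a note.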

RESULTS

A total of 8135 entities were identified in 1250 notes; 83.6% of notes included at least one entity. Overall F1 measure for note-level symptom identification weighted by entity frequency was 0.84 for the SpaCy model and 0.88 for the Longformer model. F1 measure for identifying presence or absence of any UTI symptom in a clinical note was 0.96 (232/250 correctly classified) for the SpaCy model and 0.98 (240/250 correctly classified) for the Longformer model.
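
Two of the quantities reported above can be made concrete with a short sketch. It assumes that "weighted by entity frequency" means a support-weighted average of per-symptom F1 scores (the abstract does not spell out the formula), and the per-symptom scores and counts below are invented for illustration. The only figures taken from the abstract are the counts 232/250 and 240/250; these fractions are accuracies, which are a different quantity from the reported F1 values of 0.96 and 0.98.

```python
def weighted_f1(per_label_f1, per_label_support):
    """Average per-label F1, weighting each label by how often it occurs (assumed definition)."""
    total = sum(per_label_support.values())
    return sum(per_label_f1[label] * per_label_support[label] for label in per_label_f1) / total


def accuracy(correct, total):
    """Fraction of notes classified correctly."""
    return correct / total


# Hypothetical per-symptom scores and entity counts, not taken from the study.
f1_by_symptom = {"dysuria": 0.9, "fever": 0.8, "hematuria": 0.5}
support = {"dysuria": 60, "fever": 30, "hematuria": 10}

print(round(weighted_f1(f1_by_symptom, support), 3))  # 0.83
print(round(sum(f1_by_symptom.values()) / len(f1_by_symptom), 3))  # 0.733 (unweighted mean)

# Accuracy implied by the counts in the abstract.
print(round(accuracy(232, 250), 3))  # 0.928 (SpaCy)
print(round(accuracy(240, 250), 3))  # 0.96 (Longformer)
```

The weighted average sits above the unweighted mean here because the frequent symptoms in the toy data are also the ones scored best, which is one way an overall figure can mask weaker performance on rare symptoms.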

CONCLUSIONS

The study demonstrated the utility of LLMs and transformer-based models in particular for extracting UTI symptoms from unstructured ED clinical notes; models were highly accurate for detecting the presence or absence of any UTI symptom on the note level, with variable performance for individual symptoms.

Similar articles

1. Identifying signs and symptoms of urinary tract infection from emergency department clinical notes using large language models.
   Acad Emerg Med. 2024 Jun;31(6):599-610. doi: 10.1111/acem.14883. Epub 2024 Apr 3.
2. Identifying Urinary Tract Infection-Related Information in Home Care Nursing Notes.
   J Am Med Dir Assoc. 2021 May;22(5):1015-1021.e2. doi: 10.1016/j.jamda.2020.12.010. Epub 2021 Jan 9.
3. Identifying Patient-Reported Outcome Measure Documentation in Veterans Health Administration Chiropractic Clinic Notes: Natural Language Processing Analysis.
   JMIR Med Inform. 2025 Apr 2;13:e66466. doi: 10.2196/66466.
4. Task definition, annotated dataset, and supervised natural language processing models for symptom extraction from unstructured clinical notes.
   J Biomed Inform. 2020 Feb;102:103354. doi: 10.1016/j.jbi.2019.103354. Epub 2019 Dec 12.
5. Using Large Language Models to Annotate Complex Cases of Social Determinants of Health in Longitudinal Clinical Records.
   medRxiv. 2024 Apr 27:2024.04.25.24306380. doi: 10.1101/2024.04.25.24306380.
6. Extracting Medical Information From Free-Text and Unstructured Patient-Generated Health Data Using Natural Language Processing Methods: Feasibility Study With Real-world Data.
   JMIR Form Res. 2023 Mar 7;7:e43014. doi: 10.2196/43014.
7. Early identification of suspected serious infection among patients afebrile at initial presentation using neural network models and natural language processing: A development and external validation study in the emergency department.
   Am J Emerg Med. 2024 Jun;80:67-76. doi: 10.1016/j.ajem.2024.03.006. Epub 2024 Mar 16.
8. Detecting the presence of an indwelling urinary catheter and urinary symptoms in hospitalized patients using natural language processing.
   J Biomed Inform. 2017 Jul;71S:S39-S45. doi: 10.1016/j.jbi.2016.07.012. Epub 2016 Jul 9.
9. Identifying incarceration status in the electronic health record using large language models in emergency department settings.
   J Clin Transl Sci. 2024 Mar 11;8(1):e53. doi: 10.1017/cts.2024.496. eCollection 2024.
10. Classifying Unstructured Text in Electronic Health Records for Mental Health Prediction Models: Large Language Model Evaluation Study.
    JMIR Med Inform. 2025 Jan 21;13:e65454. doi: 10.2196/65454.

Cited by

1. Mapping artificial intelligence models in emergency medicine: A scoping review on artificial intelligence performance in emergency care and education.
   Turk J Emerg Med. 2025 Apr 1;25(2):67-91. doi: 10.4103/tjem.tjem_45_25. eCollection 2025 Apr-Jun.
2. Words to live by: Using medic impressions to identify the need for prehospital lifesaving interventions.
   Acad Emerg Med. 2025 May;32(5):516-525. doi: 10.1111/acem.15067. Epub 2025 Jan 24.
3. Patient-centric knowledge graphs: a survey of current methods, challenges, and applications.
   Front Artif Intell. 2024 Oct 23;7:1388479. doi: 10.3389/frai.2024.1388479. eCollection 2024.