Large language models for accurate disease detection in electronic health records: the examples of crystal arthropathies.

Author information

Bürgisser Nils, Chalot Etienne, Mehouachi Samia, Buclin Clement P, Lauper Kim, Courvoisier Delphine S, Mongin Denis

Affiliations

Division of Rheumatology, Geneva University Hospitals, Geneva, Switzerland.

Division of Internal Medicine, Geneva University Hospitals, Geneva, Switzerland.

Publication information

RMD Open. 2024 Dec 20;10(4):e005003. doi: 10.1136/rmdopen-2024-005003.

DOI:10.1136/rmdopen-2024-005003

PMID:39794274

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11664341/

Abstract

OBJECTIVES

We propose and test a framework to detect disease diagnosis using a recent large language model (LLM), Meta's Llama-3-8B, on French-language electronic health record (EHR) documents. Specifically, it focuses on detecting gout ('goutte' in French), a ubiquitous French term that has multiple meanings beyond the disease. The study compares the performance of the LLM-based framework with traditional natural language processing techniques and tests its dependence on the parameter used.

METHODS

The framework was developed using a training and testing set of 700 paragraphs assessing 'gout' from a random selection of EHR documents from a tertiary university hospital in Geneva, Switzerland. All paragraphs were manually reviewed and classified by two healthcare professionals into disease (true gout) and non-disease (gold standard). The LLM's accuracy was tested using few-shot and chain-of-thought prompting and compared with a regular expression (regex)-based method, focusing on the effects of model parameters and prompt structure. The framework was further validated on 600 paragraphs assessing 'Calcium Pyrophosphate Deposition Disease (CPPD)'.
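To make the two approaches compared above more concrete, here is a minimal Python sketch of what a regex baseline and a few-shot, chain-of-thought prompt might look like for the ambiguous French word 'goutte'. Everything in it is an illustrative assumption: the exclusion patterns, the example paragraphs, the prompt wording and the function names `regex_detects_gout` and `build_prompt` are hypothetical and are not taken from the study. No model is called; the prompt is only assembled as text.

```python
import re

# Hypothetical exclusion patterns for non-disease uses of "goutte" in French.
# These are assumptions for illustration, not the rules used in the study.
NON_DISEASE = re.compile(
    r"goutte\s+à\s+goutte"  # "drop by drop", e.g. a drip infusion
    r"|gouttes?\s+(?:oculaires?|ophtalmiques?|auriculaires?|nasales?)"  # eye, ear, nose drops
    r"|\d+\s*gouttes?\b",  # a dose counted in drops
    re.IGNORECASE,
)
MENTION = re.compile(r"\bgouttes?\b", re.IGNORECASE)


def regex_detects_gout(paragraph: str) -> bool:
    """Return True if 'goutte' still appears after removing known non-disease uses."""
    cleaned = NON_DISEASE.sub(" ", paragraph)
    return MENTION.search(cleaned) is not None


# Hypothetical labelled examples: (paragraph, short reasoning, answer).
FEW_SHOT = [
    ("Crise de goutte du gros orteil droit, sous colchicine.",
     "The text describes an acute joint attack treated with colchicine.", "yes"),
    ("Instiller 1 goutte dans chaque oeil trois fois par jour.",
     "Here 'goutte' is a drop of eye medication, not a disease.", "no"),
]


def build_prompt(paragraph: str) -> str:
    """Assemble a few-shot prompt that asks for brief reasoning before the answer."""
    parts = [
        "Does the paragraph state that the patient has gout (the disease)? "
        "Reason briefly, then answer yes or no."
    ]
    for text, reasoning, answer in FEW_SHOT:
        parts.append(f"Paragraph: {text}\nReasoning: {reasoning}\nAnswer: {answer}")
    # The final block is left open so the model continues with its own reasoning.
    parts.append(f"Paragraph: {paragraph}\nReasoning:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    sample = "Patient connu pour une goutte traitée par allopurinol."
    print(regex_detects_gout(sample))
    print(build_prompt(sample))
```

A rule list like this has to anticipate every non-disease phrasing in advance, which is one plausible reason a prompt-based classifier could do better on free-text clinical notes.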

RESULTS

The LLM-based algorithm outperformed the regex method, achieving a 92.7% (88.7%-95.4%) positive predictive value, a 96.6% (94.6%-97.8%) negative predictive value and an accuracy of 95.4% (93.6%-96.7%) for gout. In the validation set on CPPD, accuracy was 94.1% (90.2%-97.6%). The LLM framework performed well over a wide range of parameter values.
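The three reported measures can all be derived from a confusion matrix against the gold standard. The Python sketch below shows one way to compute them with 95% intervals. The abstract does not say how the intervals were obtained, so the Wilson score interval used here is an assumption, and the counts in the demo are hypothetical placeholders rather than the study's data.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return PPV, NPV and accuracy, each as (estimate, (lower, upper))."""
    total = tp + fp + tn + fn
    return {
        # PPV: share of predicted-disease paragraphs that are truly disease.
        "ppv": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        # NPV: share of predicted-non-disease paragraphs that are truly non-disease.
        "npv": (tn / (tn + fn), wilson_interval(tn, tn + fn)),
        # Accuracy: share of all paragraphs classified correctly.
        "accuracy": ((tp + tn) / total, wilson_interval(tp + tn, total)),
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    for name, (estimate, (low, high)) in diagnostic_metrics(90, 10, 180, 20).items():
        print(f"{name}: {estimate:.1%} ({low:.1%}-{high:.1%})")
```

Note that PPV and NPV depend on how common true gout mentions are in the sampled paragraphs, so values measured on one corpus may not carry over to another with a different mix.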

CONCLUSION

LLMs accurately detected disease diagnoses from EHRs, even in non-English languages. They could facilitate creating large disease registers in any language, improving disease care assessment and patient recruitment for clinical trials.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2305/11664341/3d8fac8a6832/rmdopen-10-4-g001.jpg
