Generative Pre-trained Transformer 4 analysis of cardiovascular magnetic resonance reports in suspected myocarditis: A multicenter study.

Authors

Kaya Kenan, Gietzen Carsten, Hahnfeldt Robert, Zoubi Maher, Emrich Tilman, Halfmann Moritz C, Sieren Malte Maria, Elser Yannic, Krumm Patrick, Brendel Jan M, Nikolaou Konstantin, Haag Nina, Borggrefe Jan, Krüchten Ricarda von, Müller-Peltzer Katharina, Ehrengut Constantin, Denecke Timm, Hagendorff Andreas, Goertz Lukas, Gertz Roman J, Bunck Alexander Christian, Maintz David, Persigehl Thorsten, Lennartz Simon, Luetkens Julian A, Jaiswal Astha, Iuga Andra Iza, Pennig Lenhard, Kottlors Jonathan

Affiliations

Institute for Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.

Publication information

J Cardiovasc Magn Reson. 2024;26(2):101068. doi: 10.1016/j.jocmr.2024.101068. Epub 2024 Jul 28.

DOI:10.1016/j.jocmr.2024.101068

PMID:39079602

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11414660/

Abstract

BACKGROUND

Diagnosing myocarditis relies on multimodal data, including cardiovascular magnetic resonance (CMR), clinical symptoms, and blood values. The correct interpretation and integration of CMR findings require radiological expertise and knowledge. We aimed to investigate the performance of Generative Pre-trained Transformer 4 (GPT-4), a large language model, for report-based medical decision-making in the context of cardiac MRI for suspected myocarditis.

METHODS

This retrospective study included CMR reports from 396 patients with suspected myocarditis from eight centers. CMR reports and patient data, including blood values, age, and further clinical information, were provided to GPT-4 and to radiologists with 1 (resident 1), 2 (resident 2), and 4 years (resident 3) of experience in CMR and knowledge of the 2018 Lake Louise Criteria. The final impression of the report, stating the radiological assessment of whether myocarditis is present, was not provided. The performance of GPT-4 and of the human readers was compared to a consensus reading (two board-certified radiologists with 8 and 10 years of experience in CMR). Sensitivity, specificity, and accuracy were calculated.
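As a minimal sketch of how sensitivity, specificity, and accuracy can be computed for one reader against a reference standard such as the consensus reading, consider the Python snippet below. The function name `diagnostic_metrics`, the `DiagnosticMetrics` container, and the toy labels are assumptions made for illustration; they are not taken from the study, and the labels are not study data.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    accuracy: float


def diagnostic_metrics(reader, reference):
    """Compare binary reader calls against binary reference labels.

    Both sequences hold booleans: True means "myocarditis present".
    (Hypothetical helper, not the authors' code.)
    """
    if len(reader) != len(reference):
        raise ValueError("reader and reference must have the same length")
    pairs = list(zip(reader, reference))
    tp = sum(r and c for r, c in pairs)              # true positives
    tn = sum((not r) and (not c) for r, c in pairs)  # true negatives
    fp = sum(r and (not c) for r, c in pairs)        # false positives
    fn = sum((not r) and c for r, c in pairs)        # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference needs both positive and negative cases")
    return DiagnosticMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / len(pairs),
    )


# Toy labels invented for illustration only (not study data).
reference = [True, True, True, True, False, False, False, False, False, False]
reader = [True, True, True, False, False, False, False, False, True, True]
m = diagnostic_metrics(reader, reference)
print(m)
```

Each reader (GPT-4 or a resident) would be scored separately against the same reference labels, so the resulting metrics are directly comparable across readers.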

RESULTS

GPT-4 yielded an accuracy of 83%, sensitivity of 90%, and specificity of 78%, which was comparable to the physician with 1 year of experience (R1: 86%, 90%, 84%, p = 0.14) and lower than that of the more experienced physicians (R2: 89%, 86%, 91%, p = 0.007 and R3: 91%, 85%, 96%, p < 0.001). GPT-4 and the human readers showed higher diagnostic performance when results from T1- and T2-mapping sequences were part of the reports; this improvement was statistically significant for residents 1 and 3 (p = 0.004 and p = 0.02, respectively).

CONCLUSION

GPT-4 yielded good accuracy for diagnosing myocarditis based on CMR reports in a large dataset from multiple centers and therefore holds the potential to serve as a diagnostic decision-supporting tool in this capacity, particularly for less experienced physicians. Further studies are required to explore the full potential and elucidate educational aspects of the integration of large language models in medical decision-making.

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbf3/11414660/27f7edee6bf2/ga1.jpg
