
Generative Pre-trained Transformer 4 makes cardiovascular magnetic resonance reports easy to understand.

Affiliations

Department of Diagnostic and Interventional Radiology, University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany; Quantitative Imaging Lab Bonn (QILaB), University Hospital Bonn, Venusberg-Campus 1, 53127 Bonn, Germany.

University Hospital Bonn, Department of Medical Biometry, Informatics, and Epidemiology, Venusberg-Campus 1, 53127 Bonn, Germany.

Publication Information

J Cardiovasc Magn Reson. 2024 Summer;26(1):101035. doi: 10.1016/j.jocmr.2024.101035. Epub 2024 Mar 7.

DOI: 10.1016/j.jocmr.2024.101035
PMID: 38460841
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10981113/
Abstract

BACKGROUND

Patients are increasingly using Generative Pre-trained Transformer 4 (GPT-4) to better understand their own radiology findings.

PURPOSE

To evaluate the performance of GPT-4 in transforming cardiovascular magnetic resonance (CMR) reports into text that is comprehensible to medical laypersons.

METHODS

ChatGPT with GPT-4 architecture was used to generate three different explained versions of 20 various CMR reports (n = 60) using the same prompt: "Explain the radiology report in a language understandable to a medical layperson". Two cardiovascular radiologists evaluated understandability, factual correctness, completeness of relevant findings, and lack of potential harm, while 13 medical laypersons evaluated the understandability of the original and the GPT-4 reports on a Likert scale (1 "strongly disagree", 5 "strongly agree"). Readability was measured using the Automated Readability Index (ARI). Linear mixed-effects models (values given as median [interquartile range]) and intraclass correlation coefficient (ICC) were used for statistical analysis.
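The Automated Readability Index used above maps simple text statistics to a US grade level. As a minimal sketch (the study does not describe its exact tokenization; the splitting rules below are assumptions), it can be computed as:

```python
import re

def automated_readability_index(text: str) -> float:
    """Approximate ARI: 4.71*(chars/words) + 0.5*(words/sentences) - 21.43,
    where chars counts letters and digits only. The result roughly
    corresponds to the US school grade needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = text.split()
    chars = sum(1 for c in text if c.isalnum())
    if not words or not sentences:
        return 0.0
    return (4.71 * (chars / len(words))
            + 0.5 * (len(words) / len(sentences))
            - 21.43)
```

Short words and short sentences drive the score down, which is why the simplified GPT-4 reports score several grade levels below the originals.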

RESULTS

GPT-4 reports were generated in 52 ± 13 s on average. GPT-4 reports achieved a lower ARI score than the originals (original 10 [9-12] vs GPT-4 5 [4-6]; p < 0.001) and were subjectively easier for laypersons to understand (original 1 [1] vs GPT-4 4 [4, 5]; p < 0.001). Eighteen out of 20 (90%) standard CMR reports and 2/60 (3%) GPT-generated reports had an ARI score corresponding to the 8th grade level or higher. Radiologists' ratings of the GPT-4 reports reached high levels for correctness (5 [4, 5]), completeness (5 [5]), and lack of potential harm (5 [5]), with "strong agreement" for factual correctness in 94% (113/120) and completeness of relevant findings in 81% (97/120) of reports. Test-retest agreement for layperson understandability ratings between the three simplified reports generated from the same original report was substantial (ICC: 0.62; p < 0.001). Interrater agreement between radiologists was almost perfect for lack of potential harm (ICC: 0.93, p < 0.001) and moderate to substantial for completeness (ICC: 0.76, p < 0.001) and factual correctness (ICC: 0.55, p < 0.001).
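The agreement figures above are intraclass correlation coefficients. As a rough illustration only (the paper does not state which ICC variant it used; this sketch implements the simplest one-way form, ICC(1), from a one-way ANOVA decomposition):

```python
def icc1(ratings):
    """One-way random-effects intraclass correlation, ICC(1).

    ratings: list of targets (e.g. reports), each a list of k
    ratings of that target (e.g. one per rater or repeat).
    ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW).
    """
    n = len(ratings)        # number of targets
    k = len(ratings[0])     # ratings per target
    grand = sum(x for row in ratings for x in row) / (n * k)
    means = [sum(row) / k for row in ratings]
    # between-target and within-target mean squares
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(ratings, means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

Identical ratings across raters give an ICC of 1; rating noise within a target pushes the value toward 0, which is the scale behind the 0.55-0.93 values reported above.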

CONCLUSION

GPT-4 can reliably transform complex CMR reports into more understandable, layperson-friendly language while largely maintaining factual correctness and completeness, and can thus help convey patient-relevant radiology information in an easy-to-understand manner.

Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22c6/10981113/b6a6ba6d8694/ga1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22c6/10981113/4ab6f2d02d4c/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22c6/10981113/682ef7211743/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22c6/10981113/4afd73c8eadc/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22c6/10981113/55a3b44f5b11/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22c6/10981113/b575357936c8/gr5.jpg

Similar Articles

1
Generative Pre-trained Transformer 4 makes cardiovascular magnetic resonance reports easy to understand.
J Cardiovasc Magn Reson. 2024 Summer;26(1):101035. doi: 10.1016/j.jocmr.2024.101035. Epub 2024 Mar 7.
2
Generative Pre-trained Transformer 4 analysis of cardiovascular magnetic resonance reports in suspected myocarditis: A multicenter study.
J Cardiovasc Magn Reson. 2024;26(2):101068. doi: 10.1016/j.jocmr.2024.101068. Epub 2024 Jul 28.
3
Accuracy, readability, and understandability of large language models for prostate cancer information to the public.
Prostate Cancer Prostatic Dis. 2024 May 14. doi: 10.1038/s41391-024-00826-y.
4
Evaluation of Generative Language Models in Personalizing Medical Information: Instrument Validation Study.
JMIR AI. 2024 Aug 13;3:e54371. doi: 10.2196/54371.
5
Probing clarity: AI-generated simplified breast imaging reports for enhanced patient comprehension powered by ChatGPT-4o.
Eur Radiol Exp. 2024 Oct 30;8(1):124. doi: 10.1186/s41747-024-00526-1.
6
PRECISE framework: Enhanced radiology reporting with GPT for improved readability, reliability, and patient-centered care.
Eur J Radiol. 2025 Jun;187:112124. doi: 10.1016/j.ejrad.2025.112124. Epub 2025 Apr 17.
7
Enhanced PROcedural Information READability for Patient-Centered Care in Interventional Radiology With Large Language Models (PRO-READ IR).
J Am Coll Radiol. 2025 Jan;22(1):84-97. doi: 10.1016/j.jacr.2024.08.010. Epub 2024 Aug 30.
8
Preliminary assessment of automated radiology report generation with generative pre-trained transformers: comparing results to radiologist-generated reports.
Jpn J Radiol. 2024 Feb;42(2):190-200. doi: 10.1007/s11604-023-01487-y. Epub 2023 Sep 15.
9
Assessing the Quality and Reliability of ChatGPT's Responses to Radiotherapy-Related Patient Queries: Comparative Study With GPT-3.5 and GPT-4.
JMIR Cancer. 2025 Apr 16;11:e63677. doi: 10.2196/63677.
10
Comparative analysis of GPT-4-based ChatGPT's diagnostic performance with radiologists using real-world radiology reports of brain tumors.
Eur Radiol. 2025 Apr;35(4):1938-1947. doi: 10.1007/s00330-024-11032-8. Epub 2024 Aug 28.

Cited By

1
Automated cardiac magnetic resonance interpretation derived from prompted large language models.
Cardiovasc Diagn Ther. 2025 Aug 30;15(4):726-737. doi: 10.21037/cdt-2025-112. Epub 2025 Aug 28.
2
Assessing the ability of large language models to simplify lumbar spine imaging reports into patient-facing text: a pilot study of GPT-4.
Skeletal Radiol. 2025 Sep 9. doi: 10.1007/s00256-025-05027-9.
3
Development, optimization, and preliminary evaluation of a novel artificial intelligence tool to promote patient health literacy in radiology reports: The Rads-Lit tool.
PLoS One. 2025 Sep 3;20(9):e0331368. doi: 10.1371/journal.pone.0331368. eCollection 2025.
4
Intra-axial primary brain tumor differentiation: comparing large language models on structured MRI reports vs. radiologists on images.
Eur Radiol. 2025 Aug 22. doi: 10.1007/s00330-025-11924-3.
5
Chatbots in Radiology: Current Applications, Limitations and Future Directions of ChatGPT in Medical Imaging.
Diagnostics (Basel). 2025 Jun 26;15(13):1635. doi: 10.3390/diagnostics15131635.
6
Improving the Readability of Institutional Heart Failure-Related Patient Education Materials Using GPT-4: Observational Study.
JMIR Cardio. 2025 Jul 8;9:e68817. doi: 10.2196/68817.
7
Improving radiology reporting accuracy: use of GPT-4 to reduce errors in reports.
Abdom Radiol (NY). 2025 Jun 27. doi: 10.1007/s00261-025-05079-4.
8
Evaluation of a large language model to simplify discharge summaries and provide cardiological lifestyle recommendations.
Commun Med (Lond). 2025 May 29;5(1):208. doi: 10.1038/s43856-025-00927-2.
9
Reply to the Letter to the Editor: Improving generative-AI performance in radiology through test-time compute.
Eur Radiol. 2025 May 6. doi: 10.1007/s00330-025-11665-3.
10
Large language models for error detection in radiology reports: a comparative analysis between closed-source and privacy-compliant open-source models.
Eur Radiol. 2025 Feb 20. doi: 10.1007/s00330-025-11438-y.