Department of Radiology, LMU University Hospital, LMU Munich, Munich, Germany.
Comprehensive Pneumology Center (CPC-M), Member of the German Center for Lung Research (DZL), Munich, Germany.
Eur Radiol. 2024 May;34(5):2817-2825. doi: 10.1007/s00330-023-10213-1. Epub 2023 Oct 5.
OBJECTIVES: To assess the quality of simplified radiology reports generated with the large language model (LLM) ChatGPT and to discuss the challenges and opportunities of ChatGPT-like LLMs for medical text simplification.

METHODS: In this exploratory case study, a radiologist created three fictitious radiology reports, which we simplified by prompting ChatGPT with "Explain this medical report to a child using simple language." In a questionnaire, we asked 15 radiologists to rate the quality of the simplified radiology reports with respect to factual correctness, completeness, and potential harm to patients. We used Likert scale analysis and inductive free-text categorization to assess the quality of the simplified reports.

RESULTS: Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to patients. Nevertheless, instances of incorrect statements, omitted relevant medical information, and potentially harmful passages were reported.

CONCLUSION: While further adaptation to the medical field is needed, the initial insights of this study indicate tremendous potential in using LLMs like ChatGPT to improve patient-centered care in radiology and other medical domains.

CLINICAL RELEVANCE STATEMENT: Patients have started to use ChatGPT to simplify and explain their medical reports, which is expected to affect patient-doctor interaction. This phenomenon raises several opportunities and challenges for clinical routine.

KEY POINTS:
• Patients have started to use ChatGPT to simplify their medical reports, but the quality of the resulting texts was unknown.
• In a questionnaire, most participating radiologists rated the overall quality of radiology reports simplified with ChatGPT as good. However, they also highlighted a notable presence of errors, potentially leading patients to draw harmful conclusions.
• Large language models such as ChatGPT have vast potential to enhance patient-centered care in radiology and other medical domains. To realize this potential while minimizing harm, they need supervision by medical experts and adaptation to the medical field.