Department of Radiology, King's College Hospital London, Dubai, United Arab Emirates.
Department of Radiology, Eskişehir Osmangazi University, Eskişehir, Turkiye; Department of Medical Education, Gazi University, Ankara, Turkiye.
Patient Educ Couns. 2024 Sep;126:108307. doi: 10.1016/j.pec.2024.108307. Epub 2024 May 3.
OBJECTIVE: To evaluate artificial intelligence (AI) language models (ChatGPT-4, BARD, Microsoft Copilot) in simplifying radiology reports, assessing readability, understandability, actionability, and accuracy of urgency classification.
METHODS: This study evaluated how effectively these AI models translate radiology reports into patient-friendly language and provide understandable, actionable suggestions and urgency classifications. Thirty radiology reports were processed with each AI tool, and the outputs were assessed for readability (Flesch Reading Ease, Flesch-Kincaid Grade Level), understandability (PEMAT), and accuracy of urgency classification. ANOVA and Chi-Square tests were performed to compare the models' performance.
RESULTS: All three AI models successfully transformed medical jargon into more accessible language, with BARD achieving the best readability scores. All models scored above 70% on understandability, with ChatGPT-4 and BARD leading (both p < 0.001). The models' accuracy in urgency classification varied, although the differences were not statistically significant (p = 0.284).
CONCLUSION: AI language models effectively simplify radiology reports, potentially improving patient comprehension and engagement in health decisions. However, their accuracy in assessing the urgency of medical conditions from radiology reports suggests a need for further refinement.
PRACTICE IMPLICATIONS: Incorporating AI into radiology communication can empower patients, but further development is crucial for comprehensive and actionable patient support.
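The two readability metrics named in the methods are closed-form formulas over sentence, word, and syllable counts. Below is a minimal Python sketch of both; the syllable counter is a naive vowel-group heuristic (the study does not specify its exact tooling), and the example sentences are illustrative, not taken from the study's reports.

```python
import re

def count_syllables(word: str) -> int:
    """Rough vowel-group heuristic; published studies typically use dictionary-based counters."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1  # drop a likely silent trailing 'e'
    return max(n, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for a text."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # average words per sentence
    spw = syllables / len(words)   # average syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return fre, fkgl

# Illustrative pair: a report-style sentence vs. a simplified rewording.
original = "There is a non-displaced fracture of the distal radius."
simplified = "You have a small break in your wrist bone. It has not moved out of place."
print(readability(original))    # lower reading ease, higher grade level
print(readability(simplified))  # higher reading ease, lower grade level
```

Higher Flesch Reading Ease and lower Flesch-Kincaid Grade Level both indicate easier text, which is why simplified outputs should move the two scores in opposite directions.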
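For concreteness, here is one way the reported ANOVA and Chi-Square comparisons could be run in Python with scipy. All scores and accuracy counts below are hypothetical placeholders matching the study's design (three models, 30 reports each), not its actual data.

```python
import numpy as np
from scipy.stats import f_oneway, chi2_contingency

# Hypothetical per-report readability scores for the three models (n = 30 each).
rng = np.random.default_rng(0)
chatgpt4 = rng.normal(60, 8, 30)
bard = rng.normal(70, 8, 30)
copilot = rng.normal(58, 8, 30)

# One-way ANOVA comparing mean readability across the three models.
f_stat, p_anova = f_oneway(chatgpt4, bard, copilot)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")

# Chi-square test on urgency-classification accuracy:
# rows = models, columns = (correct, incorrect) counts out of 30 reports.
table = np.array([
    [24, 6],   # ChatGPT-4 (hypothetical counts)
    [22, 8],   # BARD (hypothetical counts)
    [20, 10],  # Microsoft Copilot (hypothetical counts)
])
chi2, p_chi2, dof, _ = chi2_contingency(table)
print(f"Chi-square: chi2 = {chi2:.2f}, dof = {dof}, p = {p_chi2:.4f}")
```

ANOVA suits the continuous readability scores, while the chi-square contingency test suits the categorical correct/incorrect urgency classifications, which is consistent with the pairing of tests the abstract reports.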