Department of Radiology, Research Institute of Radiological Science, and Center for Clinical Imaging Data Science (CCIDS), Yonsei University College of Medicine, 50-1 Yonsei-Ro, Seodaemun-Gu, Seoul, 03722, South Korea.
Institute for Innovation in Digital Healthcare, Yonsei University, Seoul, South Korea.
Sci Rep. 2024 Jun 8;14(1):13218. doi: 10.1038/s41598-024-63824-z.
This study aimed to assess the efficacy of AI-generated radiology reports in terms of report summary, patient-friendliness, and recommendations, and to evaluate the consistency of report quality and accuracy, with the goal of advancing radiology workflow. A total of 685 spine MRI reports were retrieved from our hospital database. AI-generated radiology reports were produced in three formats: (1) summary reports, (2) patient-friendly reports, and (3) recommendations. The occurrence of artificial hallucinations in the AI-generated reports was evaluated. Two radiologists conducted qualitative and quantitative assessments using the original report as the reference standard. Two non-physician raters assessed their understanding of the content of the original and patient-friendly reports using a 5-point Likert scale. The AI-generated radiology reports received high average scores across all three formats. The average comprehension score for the original reports was 2.71 ± 0.73, while the score for the patient-friendly reports was significantly higher at 4.69 ± 0.48 (p < 0.001). The rate of artificial hallucinations was 1.12%, and the rate of potentially harmful translations was 7.40%. In conclusion, the potential benefits of using generative AI assistants to generate these reports include improved report quality, greater efficiency in the radiology workflow for producing summaries, patient-centered reports, and recommendations, and a move toward patient-centered radiology.