Butler James J, Acosta Ernesto, Kuna Michael C, Harrington Michael C, Rosenbaum Andrew J, Mulligan Michael T, Kennedy John G
NYU Langone Health, New York, USA.
Albany Medical Center, NY, USA.
Hand (N Y). 2024 Aug 13:15589447241267766. doi: 10.1177/15589447241267766.
The purpose of this study was to assess the effectiveness of an Artificial Intelligence-Large Language Model (AI-LLM) at improving the readability of hand and wrist radiology reports.
The radiology reports of 100 hand and/or wrist radiographs, 100 hand and/or wrist computed tomography (CT) scans, and 100 hand and/or wrist magnetic resonance imaging (MRI) scans were extracted. The following prompt command was inserted into the AI-LLM: "Explain this radiology report to a patient in layman's terms in the second person: [Report Text]." The report length, Flesch reading ease score (FRES), and Flesch-Kincaid reading level (FKRL) were calculated for the original radiology report and the AI-LLM-generated report. The accuracy of each AI-LLM-generated report was assessed via a 5-point Likert scale, and any "hallucinations" produced by the AI-LLM were recorded.
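The two readability metrics named above are defined by standard formulas over words per sentence and syllables per word. A minimal sketch of their computation is below; the regex-based tokenizer and vowel-group syllable counter are simplifying assumptions for illustration, not the validated readability software the study would have used.

```python
import re


def count_syllables(word: str) -> int:
    # Crude heuristic: count vowel groups, subtracting a common silent
    # final "e". An assumption for illustration only.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (FRES, FKRL) using the standard Flesch formulas."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / len(sentences)           # words per sentence
    spw = syllables / len(words)                # syllables per word
    fres = 206.835 - 1.015 * wps - 84.6 * spw   # Flesch reading ease
    fkrl = 0.39 * wps + 11.8 * spw - 15.59      # Flesch-Kincaid grade level
    return fres, fkrl
```

Higher FRES means easier text, while FKRL maps directly to a US school grade level, which is why the study can report results as "below an eighth-grade reading level."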
There was a statistically significant improvement in mean FRES and FKRL for the AI-LLM-generated radiograph, CT, and MRI reports. For all AI-LLM-generated reports, the mean reading level improved to below an eighth-grade level. The mean Likert score for the AI-LLM-generated radiograph report, CT report, and MRI report was 4.1 ± 0.6, 3.9 ± 0.6, and 3.9 ± 0.7, respectively. The hallucination rate in the AI-LLM-generated radiograph report, CT report, and MRI report was 3%, 6%, and 6%, respectively.
This study demonstrates that an AI-LLM effectively improves the readability of hand and wrist radiology reports, underscoring its potential as a promising, patient-centric strategy for improving patient comprehension of imaging reports. Level of Evidence: IV.