Mugu Vamshi K, Carr Brendan M, Olson Mike C, Schupbach John C, Eguia Francisco A, Schmitz John J, Khandelwal Ashish
Department of Radiology.
Department of Emergency Medicine, Mayo Clinic, Rochester, MN.
Ultrasound Q. 2024 Dec 17;41(1). doi: 10.1097/RUQ.0000000000000699. eCollection 2025 Mar 1.
Incidental findings in diagnostic imaging are common, but follow-up recommendations often lack consistency. The Society of Radiologists in Ultrasound (SRU) issued guidelines in 2021 for managing incidentally detected gallbladder polyps, aiming to balance follow-up with avoiding overtreatment. Adherence to these guidelines in radiology reports is variable, however, which makes it difficult for the clinician to pursue appropriate follow-up for the patient. The purpose of this project is to test the feasibility of a Large Language Model (LLM)-based tool for incorporating SRU guidelines into radiology reports. Additionally, we propose a framework for closely integrating professional-society follow-up recommendations into radiology reports, using this tool as an example.

Following institutional review board approval, we retrospectively reviewed gallbladder ultrasound examinations performed on adult emergency department (ED) patients in 2022. Data on patient demographics and radiology report content were collected. Using the 2021 SRU guidelines, we developed an interactive tool employing retrieval-augmented generation (RAG) and prompt engineering. A board-certified radiologist tested the accuracy of the tool, whereas a board-certified emergency medicine physician assessed the clarity and consistency of its recommendations.

The interactive tool, GB-PRL, outperformed leading closed-source and open-source LLMs, achieving 100% accuracy in risk categorization and follow-up recommendations on hypothetical user queries (P < 0.001). The tool also showed superior accuracy compared to radiology reports on retrospective data (P = 0.04). Although GB-PRL demonstrated greater clarity and consistency, the improvement over the radiology reports was not statistically significant (P = 0.22). Prospective testing of GB-PRL is needed before it is integrated into clinical practice.