Quantitative Evaluation of Large Language Models to Streamline Radiology Report Impressions: A Multimodal Retrospective Analysis.

Author information

From the Yale School of Medicine (R.D., P.K.) and Department of Radiology and Biomedical Imaging (K.S.A., S.S.B., S.C., H.P.F.), Yale School of Medicine, 333 Cedar St, New Haven, CT 06510; Yale School of Management, New Haven, Conn (H.P.F.); and Department of Health Policy and Management, Yale School of Public Health, New Haven, Conn (H.P.F.).

Publication information

Radiology. 2024 Mar;310(3):e231593. doi: 10.1148/radiol.231593.

DOI: 10.1148/radiol.231593
PMID: 38530171
Abstract

Background: The complex medical terminology of radiology reports may cause confusion or anxiety for patients, especially given increased access to electronic health records. Large language models (LLMs) can potentially simplify radiology report readability.

Purpose: To compare the performance of four publicly available LLMs (ChatGPT-3.5 and ChatGPT-4, Bard [now known as Gemini], and Bing) in producing simplified radiology report impressions.

Materials and Methods: In this retrospective comparative analysis of the four LLMs (accessed July 23 to July 26, 2023), the Medical Information Mart for Intensive Care (MIMIC)-IV database was used to gather 750 anonymized radiology report impressions covering a range of imaging modalities (MRI, CT, US, radiography, mammography) and anatomic regions. Three distinct prompts were employed to assess the LLMs' ability to simplify report impressions. The first prompt (prompt 1) was "Simplify this radiology report." The second prompt (prompt 2) was "I am a patient. Simplify this radiology report." The last prompt (prompt 3) was "Simplify this radiology report at the 7th grade level." Each prompt was followed by the radiology report impression and was queried once. The primary outcome was simplification as assessed by readability score. Readability was assessed using the average of four established readability indexes. The nonparametric Wilcoxon signed-rank test was applied to compare reading grade levels across LLM output.

Results: All four LLMs simplified radiology report impressions across all prompts tested (P < .001). Within prompts, differences were found between LLMs. Providing the context of being a patient or requesting simplification at the seventh-grade level reduced the reading grade level of output for all models and prompts (except prompt 1 to prompt 2 for ChatGPT-4) (P < .001).

Conclusion: Although the success of each LLM varied depending on the specific prompt wording, all four models simplified radiology report impressions across all modalities and prompts tested.

© RSNA, 2024. See also the editorial by Rahsepar in this issue.
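The readability outcome and the paired comparison described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline. The abstract does not name the four readability indexes, so the choice here (Flesch-Kincaid Grade Level, Gunning Fog, Automated Readability Index, Coleman-Liau) is an assumption. The syllable counter is a rough heuristic, and the function names `count_syllables`, `mean_grade_level`, and `wilcoxon_signed_rank` are hypothetical. The signed-rank test uses the normal approximation without tie or continuity correction, so its P value is approximate.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Heuristic syllable count: vowel groups, minus a likely silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def mean_grade_level(text: str) -> float:
    """Average of four grade-level indexes (which four is an assumption)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    n_sent, n_words = len(sentences), len(words)
    letters = sum(len(w) for w in words)
    syllables = [count_syllables(w) for w in words]
    complex_words = sum(1 for s in syllables if s >= 3)

    wps = n_words / n_sent  # words per sentence
    fkgl = 0.39 * wps + 11.8 * sum(syllables) / n_words - 15.59
    fog = 0.4 * (wps + 100 * complex_words / n_words)
    ari = 4.71 * letters / n_words + 0.5 * wps - 21.43
    cli = (0.0588 * (100 * letters / n_words)
           - 0.296 * (100 * n_sent / n_words) - 15.8)
    return (fkgl + fog + ari + cli) / 4


def wilcoxon_signed_rank(before, after):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Returns (W, p), where W is the smaller of the positive and negative
    rank sums. Zero differences are dropped; tied ranks are averaged.
    """
    diffs = [b - a for b, a in zip(before, after) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, p
```

In use, `mean_grade_level` would be applied to each original impression and to each LLM output, and the two paired lists of grade levels passed to `wilcoxon_signed_rank`. For real analyses, a maintained implementation such as `scipy.stats.wilcoxon` is likely a better choice than this hand-rolled version.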


