
Use of ChatGPT Large Language Models to Extract Details of Recommendations for Additional Imaging From Free-Text Impressions of Radiology Reports.

Author information

Li Kathryn W, Lacson Ronilda, Guenette Jeffrey P, DiPiro Pamela J, Burk Kristine S, Kapoor Neena, Salah Fatima, Khorasani Ramin

Affiliation

Center for Evidence-Based Imaging, Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, 1620 Tremont St, Boston, MA 02120.

Publication information

AJR Am J Roentgenol. 2025 Apr;224(4):e2432341. doi: 10.2214/AJR.24.32341. Epub 2025 Jan 29.

DOI:10.2214/AJR.24.32341
PMID:39878409
Abstract

Automated extraction of actionable details of recommendations for additional imaging (RAIs) from radiology reports could facilitate tracking and timely completion of clinically necessary RAIs and thereby potentially reduce diagnostic delays. The purpose of the study was to assess the performance of large language models (LLMs) in extracting actionable details of RAIs from radiology reports.

This retrospective single-center study evaluated reports of diagnostic radiology examinations performed across modalities and care settings within five subspecialties (abdominal imaging, musculoskeletal imaging, neuroradiology, nuclear medicine, thoracic imaging) in August 2023. Of reports identified by a previously validated natural language processing algorithm to contain an RAI, 250 were randomly selected; 231 of these reports were confirmed to contain an RAI on manual review and formed the study sample. Twenty-five reports were used to engineer a prompt instructing an LLM, when given a report impression containing an RAI, to extract details about the modality, body part, time frame, and rationale of the RAI; the remaining 206 reports were used for testing the prompt in combination with GPT-3.5 and GPT-4. A 4th-year medical student and a radiologist from the relevant subspecialty independently classified the LLM outputs as correct versus incorrect for extracting the four actionable details of RAIs in comparison with the report impressions; a third reviewer assisted in resolving discrepancies. Extraction accuracy was summarized and compared between LLMs using consensus assessments.

For GPT-3.5 and GPT-4, the two reviewers agreed about classification of LLM output as correct versus incorrect with respect to report impressions in 95.6% and 94.2% of cases for RAI modality, 89.3% and 88.3% for RAI body part, 96.1% and 95.1% for RAI time frame, and 89.8% and 88.8% for RAI rationale, respectively. GPT-4 was more accurate than GPT-3.5 in extracting RAI modality (94.2% [194/206] vs 85.4% [176/206], P < .001), RAI body part (86.9% [179/206] vs 77.2% [159/206], P = .004), and RAI time frame (99.0% [204/206] vs 95.6% [197/206], P = .02). Both LLMs had accuracy of 91.7% (189/206) for extracting RAI rationale.

LLMs were used to extract actionable details of RAIs from free-text impression sections of radiology reports; GPT-4 outperformed GPT-3.5. The technique could represent an innovative method to facilitate timely completion of clinically necessary radiologist recommendations.
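The reported percentages can be checked directly from the counts given in the abstract. The following is a minimal sketch of that arithmetic only; the variable and function names are illustrative assumptions, and it does not reproduce the study's statistical comparison, since the abstract does not name the test behind the P values.

```python
# Sample sizes taken from the abstract.
N_CONFIRMED = 231      # reports confirmed to contain an RAI on manual review
N_PROMPT_DESIGN = 25   # reports used to engineer the prompt
N_TEST = N_CONFIRMED - N_PROMPT_DESIGN  # reports used for testing

# Correct extractions per detail and model, as reported in the abstract.
correct = {
    "modality":   {"GPT-4": 194, "GPT-3.5": 176},
    "body part":  {"GPT-4": 179, "GPT-3.5": 159},
    "time frame": {"GPT-4": 204, "GPT-3.5": 197},
    "rationale":  {"GPT-4": 189, "GPT-3.5": 189},
}


def accuracy_pct(n_correct: int, n_total: int = N_TEST) -> float:
    """Return accuracy as a percentage rounded to one decimal place."""
    return round(100 * n_correct / n_total, 1)


# Accuracy per detail and model, derived only from the counts above.
accuracies = {
    detail: {model: accuracy_pct(n) for model, n in by_model.items()}
    for detail, by_model in correct.items()
}

for detail, by_model in accuracies.items():
    print(f"{detail}: GPT-4 {by_model['GPT-4']}% vs GPT-3.5 {by_model['GPT-3.5']}%")
```

Recomputing from the counts confirms that the test set size (206) follows from the 231 confirmed reports minus the 25 used for prompt design, and that each stated percentage matches its fraction.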

