Constructing a Large Language Model to Generate Impressions from Findings in Radiology Reports.

Author information

From the Department of Radiology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China (Lu Zhang, L.W., Y.Z., Y.F., J.Z., Lin Zhang, G.Y., X. Xie); Winning Health Technology, Shanghai, China (M.L., X. Xu, Z.P., X.C.); and Department of Radiology, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Yan Chang Zhong Rd 301, Shanghai 200040, China (X. Xie).

Publication information

Radiology. 2024 Sep;312(3):e240885. doi: 10.1148/radiol.240885.

DOI: 10.1148/radiol.240885
PMID: 39287525

Abstract

Background: The specialization and complexity of radiology makes the automatic generation of radiologic impressions (ie, a diagnosis with differential diagnosis and management recommendations) challenging.

Purpose: To develop a large language model (LLM) that generates impressions based on imaging findings and to evaluate its performance in professional and linguistic dimensions.

Materials and Methods: Six radiologists recorded imaging examination findings from August 2 to 31, 2023, at Shanghai General Hospital and used the developed LLM before routinely writing report impressions for multiple radiologic modalities (CT, MRI, radiography, mammography) and anatomic sites (cranium and face, neck, chest, upper abdomen, lower abdomen, vessels, bone and joint, spine, breast), making necessary corrections and completing the radiologic impression. A subset was defined to investigate cases where the LLM-generated impressions differed from the final radiologist impressions by excluding identical and highly similar cases. An expert panel scored the LLM-generated impressions on a five-point Likert scale (5 = strongly agree) based on scientific terminology, coherence, specific diagnosis, differential diagnosis, management recommendations, correctness, comprehensiveness, harmlessness, and lack of bias.

Results: In this retrospective study, an LLM was pretrained using 20 GB of medical and general-purpose text data. The fine-tuning data set comprised 1.5 GB of data, including 800 radiology reports with paired instructions (describing the output task in natural language) and outputs. Test set 2 included data from 3988 patients (median age, 56 years [IQR, 40-68 years]; 2159 male). The median recall, precision, and F1 score of LLM-generated impressions were 0.775 (IQR, 0.56-1), 0.84 (IQR, 0.611-1), and 0.772 (IQR, 0.578-0.957), respectively, using the final impressions as the reference standard. In a subset of 1014 patients (median age, 57 years [IQR, 42-69 years]; 528 male), the overall median expert panel score for LLM-generated impressions was 5 (IQR, 5-5), ranging from 4 (IQR, 3-5) to 5 (IQR, 5-5).

Conclusion: The developed LLM generated radiologic impressions that were professionally and linguistically appropriate for a full spectrum of radiology examinations.

© RSNA, 2024
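
The abstract reports recall, precision, and F1 score as medians with IQRs, using the radiologists' final impressions as the reference standard, but it does not say how the overlap between a generated and a final impression was measured. The Python sketch below illustrates one common way such figures can be produced. It assumes whitespace tokenization and multiset token overlap, which is an assumption for illustration and not the authors' stated method; the function names `overlap_scores` and `median_iqr` and the example strings are hypothetical.

```python
from collections import Counter
from statistics import median, quantiles


def overlap_scores(generated: str, reference: str) -> tuple[float, float, float]:
    """Return (recall, precision, F1) of a generated impression against a reference.

    Assumption: matching is done on lowercased whitespace tokens.
    The abstract does not specify the matching unit actually used.
    """
    gen_tokens = generated.lower().split()
    ref_tokens = reference.lower().split()
    if not gen_tokens or not ref_tokens:
        return 0.0, 0.0, 0.0
    # Multiset intersection: each reference token can be matched at most once.
    shared = sum((Counter(gen_tokens) & Counter(ref_tokens)).values())
    recall = shared / len(ref_tokens)      # share of the reference that was recovered
    precision = shared / len(gen_tokens)   # share of the generated text that is supported
    f1 = 0.0 if shared == 0 else 2 * precision * recall / (precision + recall)
    return recall, precision, f1


def median_iqr(values: list[float]) -> tuple[float, float, float]:
    """Return (median, first quartile, third quartile) of per-report scores."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


if __name__ == "__main__":
    # Hypothetical generated/final impression pairs, for illustration only.
    pairs = [
        ("no acute fracture", "no acute fracture"),
        ("left lower lobe pneumonia", "pneumonia in left lower lobe"),
    ]
    f1_scores = [overlap_scores(gen, ref)[2] for gen, ref in pairs]
    print(median_iqr(f1_scores))
```

Scoring each report first and then summarizing means the median F1 is not the harmonic mean of the median precision and median recall. The reported values are consistent with that: the harmonic mean of 0.84 and 0.775 is about 0.806, whereas the reported median F1 is 0.772.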
