Automated classification of brain MRI reports using fine-tuned large language models.

Authors

Kanzawa Jun, Yasaka Koichiro, Fujita Nana, Fujiwara Shin, Abe Osamu

Affiliation

Department of Radiology, The University of Tokyo Hospital, Tokyo, Japan.

Publication information

Neuroradiology. 2024 Dec;66(12):2177-2183. doi: 10.1007/s00234-024-03427-7. Epub 2024 Jul 12.

DOI: 10.1007/s00234-024-03427-7
PMID: 38995393
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11611921/
Abstract

PURPOSE

This study aimed to investigate the efficacy of fine-tuned large language models (LLMs) in classifying brain MRI reports into pretreatment, posttreatment, and nontumor cases.

METHODS

This retrospective study included 759, 284, and 164 brain MRI reports for the training, validation, and test datasets, respectively. Radiologists stratified the reports into three groups: nontumor (group 1), posttreatment tumor (group 2), and pretreatment tumor (group 3) cases. A pretrained Japanese Bidirectional Encoder Representations from Transformers (BERT) model was fine-tuned on the training dataset and evaluated on the validation dataset. The model that achieved the highest accuracy on the validation dataset was selected as the final model. Two additional radiologists classified the reports in the test dataset into the three groups. The model's performance on the test dataset was compared with that of the two radiologists.
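The model-selection step described above (keep whichever candidate scores the highest accuracy on the validation set) can be sketched in a few lines. This is only an illustration, not the authors' code: the function names, the toy English report strings, and the two stand-in "models" are all hypothetical, and real candidates would be fine-tuned BERT checkpoints rather than keyword rules.

```python
from typing import Callable, Dict, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of reports whose predicted group matches the reference label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def select_best_model(
    candidates: Dict[str, Callable[[str], int]],
    reports: Sequence[str],
    labels: Sequence[int],
) -> Tuple[str, float]:
    """Return the name and validation accuracy of the best candidate."""
    best_name, best_acc = "", -1.0
    for name, predict in candidates.items():
        acc = accuracy(labels, [predict(r) for r in reports])
        if acc > best_acc:  # ties keep the earlier candidate
            best_name, best_acc = name, acc
    return best_name, best_acc


# Hypothetical stand-ins for fine-tuned checkpoints (groups 1/2/3 as in the study).
def keyword_rules(report: str) -> int:
    if "post" in report:
        return 2  # posttreatment tumor
    if "mass" in report and "no mass" not in report:
        return 3  # pretreatment tumor
    return 1      # nontumor


def always_group1(report: str) -> int:
    return 1


# Made-up validation examples, not data from the study.
val_reports = [
    "no mass lesion",
    "post-operative change after tumor resection",
    "new enhancing mass, suspected glioma",
]
val_labels = [1, 2, 3]

best = select_best_model(
    {"always_group1": always_group1, "keyword_rules": keyword_rules},
    val_reports,
    val_labels,
)
```

The test set plays no part in this selection; it is reserved for the final comparison with the radiologists.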

RESULTS

The fine-tuned LLM attained an overall accuracy of 0.970 (95% CI: 0.930-0.990). The model's sensitivity for groups 1/2/3 was 1.000/0.864/0.978, and its specificity was 0.991/0.993/0.958. No statistically significant differences in accuracy, sensitivity, or specificity were found between the LLM and the human readers (p ≥ 0.371). The LLM completed the classification task approximately 20- to 26-fold faster than the radiologists. The area under the receiver operating characteristic curve was 0.994 (95% CI: 0.982-1.000) for discriminating groups 2 and 3 from group 1 and 0.992 (95% CI: 0.982-1.000) for discriminating group 3 from groups 1 and 2.
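Per-group sensitivity and specificity in a three-class task are usually computed one-vs-rest from the confusion matrix, treating each group in turn as "positive" and the other two as "negative". The abstract does not state how the figures were derived, so the sketch below assumes that convention. The 3x3 matrix is made up for illustration and is not the study's data; the function names are likewise hypothetical.

```python
from typing import Dict, List


def overall_accuracy(cm: List[List[int]]) -> float:
    """Correct predictions (the diagonal) divided by all cases."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def one_vs_rest_metrics(cm: List[List[int]]) -> Dict[int, Dict[str, float]]:
    """Sensitivity and specificity per group.

    cm[i][j] counts reports whose true group is i+1 and predicted group is j+1.
    Assumes every group has at least one true case and one true non-case.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = {}
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true group k, predicted elsewhere
        fp = sum(cm[i][k] for i in range(n)) - tp   # other groups predicted as k
        tn = total - tp - fn - fp
        metrics[k + 1] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    return metrics


# Hypothetical confusion matrix (rows: true group, columns: predicted group).
toy_cm = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
toy_metrics = one_vs_rest_metrics(toy_cm)
toy_accuracy = overall_accuracy(toy_cm)
```

For the toy matrix, group 2 has 9 true positives, 1 false negative, and 1 false positive among 30 reports, giving a sensitivity of 9/10 and a specificity of 19/20.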

CONCLUSION

The fine-tuned LLM performed comparably to radiologists in classifying brain MRI reports while requiring substantially less time.
