Impact of hospital-specific domain adaptation on BERT-based models to classify neuroradiology reports.

Author information

Agarwal Siddharth, Wood David, Murray Benjamin A K, Wei Yiran, Busaidi Ayisha Al, Kafiabadi Sina, Guilhem Emily, Lynch Jeremy, Townend Matthew, Mazumder Asif, Barker Gareth J, Cole James H, Sasieni Peter, Ourselin Sebastien, Modat Marc, Booth Thomas C

Affiliations

School of Biomedical Engineering & Imaging Sciences, King's College London, Becket House, London, UK.

Department of Neuroradiology, Ruskin Wing, King's College Hospital NHS Foundation Trust, London, UK.

Publication information

Eur Radiol. 2025 Mar 17. doi: 10.1007/s00330-025-11500-9.

DOI: 10.1007/s00330-025-11500-9
PMID: 40097844
Abstract

OBJECTIVES

To determine the effectiveness of hospital-specific domain adaptation through masked language modelling (MLM) on BERT-based models' performance in classifying neuroradiology reports, and to compare these models with open-source large language models (LLMs).

MATERIALS AND METHODS

This retrospective study (2008-2019) utilised 126,556 and 86,032 MRI brain reports from two tertiary hospitals: King's College Hospital (KCH) and Guy's and St Thomas' Trust (GSTT). Various BERT-based models, including RoBERTa, BioBERT and RadBERT, underwent MLM on unlabelled reports from these centres. The downstream tasks were binary abnormality classification and multi-label classification. The performance of models with and without hospital-specific domain adaptation was compared against each other and against LLMs on internal (KCH) and external (GSTT) hold-out test sets. Model performance for binary classification was compared using 2-way and 1-way ANOVA.
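
As an illustration of what MLM involves, the sketch below applies the masking recipe from the original BERT paper (select about 15% of tokens; of those, 80% become a mask token, 10% a random token, 10% stay unchanged). The abstract does not state which masking scheme or hyperparameters the authors used, so `mask_tokens`, `MASK_TOKEN`, `mask_prob` and the toy vocabulary are assumptions made here for illustration only.

```python
import random

MASK_TOKEN = "[MASK]"  # assumed mask symbol; the real one depends on the tokenizer


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Corrupt a token sequence for masked language modelling.

    Returns (masked, labels). labels[i] holds the original token at
    positions the model must predict, and None everywhere else.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # this position contributes to the MLM loss
            roll = rng.random()
            if roll < 0.8:
                masked.append(MASK_TOKEN)         # 80%: replace with mask token
            elif roll < 0.9:
                masked.append(rng.choice(vocab))  # 10%: replace with random token
            else:
                masked.append(tok)                # 10%: keep the original token
        else:
            masked.append(tok)
            labels.append(None)  # position ignored by the loss
    return masked, labels
```

Because the labels come from the text itself, this step needs no human annotation, which is why it can be run on unlabelled reports from a new hospital before fine-tuning on the labelled classification task.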

RESULTS

All models that underwent hospital-specific domain adaptation performed better than their baseline counterparts (all p-values < 0.001). For binary classification, MLM on all available unlabelled reports (194,467 reports) yielded the highest balanced accuracies (mean ± standard deviation; KCH: 97.0 ± 0.4%, GSTT: 95.5 ± 1.0%), after which no differences between BERT-based models remained (1-way ANOVA, p-values > 0.05). There was a log-linear relationship between the number of reports and performance. Llama-3.0 70B was the best-performing LLM (KCH: 97.1%, GSTT: 94.0%). Multi-label classification demonstrated consistent performance improvements from MLM for all abnormality categories.
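
The two quantities reported above can be made concrete with a short sketch. Balanced accuracy for a binary task is the mean of sensitivity and specificity, and a log-linear relationship means performance is approximately `intercept + slope * log10(number of reports)`. The function names, the choice of base-10 logarithm and the ordinary least-squares fit are assumptions for illustration; the abstract does not describe how the authors fitted the relationship.

```python
import math


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = abnormal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (tp / pos + tn / neg)


def fit_log_linear(n_reports, scores):
    """Least-squares fit of score = intercept + slope * log10(n_reports)."""
    xs = [math.log10(n) for n in n_reports]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

Under such a fit, the slope is the expected gain in score for every tenfold increase in the number of reports used for MLM, so each additional report contributes less than the one before it.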

CONCLUSION

Hospital-specific domain adaptation should be considered best practice when deploying BERT-based models in new clinical settings. When labelled data is scarce or unavailable, LLMs can serve as a viable alternative, assuming adequate computational power is accessible.

KEY POINTS

Question: BERT-based models can classify radiology reports, but it is unclear if there is any incremental benefit from additional hospital-specific domain adaptation.

Findings: Hospital-specific domain adaptation resulted in the highest BERT-based model accuracies, and performance scaled log-linearly with the number of reports.

Clinical relevance: BERT-based models after hospital-specific domain adaptation achieve the best classification results provided sufficient high-quality training labels are available. When labelled data is scarce, LLMs such as Llama-3.0 70B are a viable alternative provided there are sufficient computational resources.

Cited by

1
Automated Protocol Suggestions for Cranial MRI Examinations Using Locally Fine-tuned BERT Models.
Clin Neuroradiol. 2025 Aug 18. doi: 10.1007/s00062-025-01554-z.
