

Accurate and Reliable Classification of Unstructured Reports on Their Diagnostic Goal Using BERT Models.

Authors

Rietberg Max Tigo, Nguyen Van Bach, Geerdink Jeroen, Vijlbrief Onno, Seifert Christin

Affiliations

Faculty of EEMCS, University of Twente, 7500 AE Enschede, The Netherlands.

Institute for Artificial Intelligence in Medicine, University of Duisburg-Essen, 45131 Essen, Germany.

Publication information

Diagnostics (Basel). 2023 Mar 27;13(7):1251. doi: 10.3390/diagnostics13071251.

DOI:10.3390/diagnostics13071251
PMID:37046469
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10093295/

Abstract

Understanding the diagnostic goal of medical reports is valuable information for understanding patient flows. This work focuses on extracting the reason for taking an MRI scan of Multiple Sclerosis (MS) patients using the attached free-form reports: Diagnosis, Progression or Monitoring. We investigate the performance of domain-dependent and general state-of-the-art language models and their alignment with domain expertise. To this end, eXplainable Artificial Intelligence (XAI) techniques are used to acquire insight into the inner workings of the model, which are verified on their trustworthiness. The verified XAI explanations are then compared with explanations from a domain expert, to indirectly determine the reliability of the model. BERTje, a Dutch Bidirectional Encoder Representations from Transformers (BERT) model, outperforms RobBERT and MedRoBERTa.nl in both accuracy and reliability. The latter model (MedRoBERTa.nl) is a domain-specific model, while BERTje is a generic model, showing that domain-specific models are not always superior. Our validation of BERTje in a small prospective study shows promising results for the potential uptake of the model in a practical setting.
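The abstract says that verified XAI explanations are compared with explanations from a domain expert, but it does not state how that comparison is scored. As a minimal sketch, assuming token-level attribution scores and a set of expert-highlighted tokens, one plausible agreement measure is the Jaccard overlap between the model's top-k tokens and the expert's tokens. The function names, the example tokens, and the scores below are hypothetical illustrations, not taken from the paper.

```python
from typing import Dict, Set


def top_k_tokens(attributions: Dict[str, float], k: int) -> Set[str]:
    """Return the k tokens with the largest absolute attribution score."""
    ranked = sorted(attributions, key=lambda tok: abs(attributions[tok]), reverse=True)
    return set(ranked[:k])


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two token sets; two empty sets count as full agreement."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical attribution scores for one report (assumed, not from the paper).
model_attr = {"controle": 0.9, "nieuwe": 0.1, "laesies": 0.4, "geen": 0.3, "mri": 0.05}
# Hypothetical tokens a domain expert marked as decisive for the label.
expert_tokens = {"controle", "geen", "nieuwe"}

model_tokens = top_k_tokens(model_attr, k=3)
agreement = jaccard(model_tokens, expert_tokens)
print(sorted(model_tokens), agreement)
```

Averaging such a score over many reports would give one number per model, which is one way the "reliability" comparison between BERTje, RobBERT, and MedRoBERTa.nl could be operationalized under these assumptions.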

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/17af/10093295/ab2592f33140/diagnostics-13-01251-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/17af/10093295/a20eda07e9d1/diagnostics-13-01251-g001.jpg
