Huang Pan, He Peng, Tian Sukun, Ma Mingrui, Feng Peng, Xiao Hualiang, Mercaldo Francesco, Santone Antonella, Qin Jing
IEEE Trans Med Imaging. 2023 Jan;42(1):15-28. doi: 10.1109/TMI.2022.3202248. Epub 2022 Dec 29.
The tumor grading of laryngeal cancer pathological images needs to be accurate and interpretable. A deep learning model based on the attention mechanism-integrated convolution (AMC) block has good inductive bias capability but poor interpretability, whereas a model based on the vision transformer (ViT) block has good interpretability but weak inductive bias capability. Therefore, we propose an end-to-end ViT-AMC network (ViT-AMCNet) with adaptive model fusion and multiobjective optimization that integrates and fuses the ViT and AMC blocks. However, existing model fusion methods often suffer from negative fusion: (1) there is no guarantee that the ViT and AMC blocks will simultaneously have good feature representation capability; (2) the difference in feature representation learning between the ViT and AMC blocks is not obvious, so the two feature representations contain much redundant information. Accordingly, we first prove the feasibility of fusing the ViT and AMC blocks based on Hoeffding's inequality. Then, we propose a multiobjective optimization method to address the problem that the ViT and AMC blocks cannot simultaneously achieve good feature representation. Finally, we propose an adaptive model fusion method that integrates a metrics block and a fusion block to increase the differences between feature representations and improve the deredundancy capability. These methods improve the fusion ability of ViT-AMCNet, and experimental results demonstrate that ViT-AMCNet significantly outperforms state-of-the-art methods. Importantly, the visualized interpretive maps are closer to the regions of interest that pathologists focus on, and the generalization ability is also excellent. Our code is publicly available at https://github.com/Baron-Huang/ViT-AMCNet.
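For context, the feasibility argument mentioned above rests on Hoeffding's inequality. In its standard form it bounds the deviation of a sum of independent bounded random variables from its expectation; the abstract does not spell out how the bound is instantiated for the two blocks, so the concrete application is left to the paper:

```latex
% Hoeffding's inequality (standard form): for independent X_1, ..., X_n with
% X_i in [a_i, b_i], the sum S_n = X_1 + ... + X_n concentrates around its mean.
\[
P\bigl(\lvert S_n - \mathbb{E}[S_n]\rvert \ge t\bigr)
  \le 2\exp\!\left(-\frac{2t^2}{\sum_{i=1}^{n}(b_i - a_i)^2}\right),
  \qquad t > 0 .
\]
```

As a rough illustration of the two-branch fusion and multiobjective training described above, the following is a minimal sketch assuming PyTorch; the branch, head, and loss definitions are illustrative stand-ins, not the released ViT-AMCNet implementation (see the GitHub link above for that).

```python
# Minimal sketch (assumption: PyTorch). Two stand-in branches emulate the ViT and
# AMC feature extractors, a fusion head combines their features, and training
# minimizes one loss per branch plus one for the fused head -- a simple scalarized
# stand-in for the paper's multiobjective optimization.
import torch
import torch.nn as nn

class TwoBranchFusionNet(nn.Module):
    def __init__(self, feat_dim=256, num_classes=3):
        super().__init__()
        # Stand-in for the ViT branch: flatten the image and project to feat_dim.
        self.vit_branch = nn.Sequential(nn.Flatten(), nn.LazyLinear(feat_dim), nn.GELU())
        # Stand-in for the AMC branch: a small convolutional feature extractor.
        self.amc_branch = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim), nn.ReLU(),
        )
        self.vit_head = nn.Linear(feat_dim, num_classes)
        self.amc_head = nn.Linear(feat_dim, num_classes)
        # Fusion block stand-in: classify from the concatenated features.
        self.fusion_head = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, x):
        f_vit, f_amc = self.vit_branch(x), self.amc_branch(x)
        return (self.vit_head(f_vit), self.amc_head(f_amc),
                self.fusion_head(torch.cat([f_vit, f_amc], dim=1)))

model = TwoBranchFusionNet()
x, y = torch.randn(2, 3, 224, 224), torch.tensor([0, 1])
logits_vit, logits_amc, logits_fused = model(x)
ce = nn.CrossEntropyLoss()
# Weighted sum of per-branch and fused losses as a scalarized multiobjective target.
loss = ce(logits_fused, y) + 0.5 * ce(logits_vit, y) + 0.5 * ce(logits_amc, y)
loss.backward()
```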