Suppr 超能文献



Advancing breast cancer diagnosis: token vision transformers for faster and accurate classification of histopathology images.

Authors

Abimouloud Mouhamed Laid, Bensid Khaled, Elleuch Mohamed, Ammar Mohamed Ben, Kherallah Monji

Affiliations

National Engineering School of Sfax, University of Sfax, Sfax, Tunisia.

Advanced Technologies for Environment and Smart Cities (ATES Unit), Sfax University, Sfax, Tunisia.

Publication

Vis Comput Ind Biomed Art. 2025 Jan 8;8(1):1. doi: 10.1186/s42492-024-00181-8.
PMID: 39775534
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11711433/
Abstract

The vision transformer (ViT) architecture, with its attention mechanism based on multi-head attention layers, has been widely adopted in various computer-aided diagnosis tasks due to its effectiveness in processing medical image information. ViTs are notably recognized for their complex architecture, which requires high-performance GPUs or CPUs for efficient model training and deployment in real-world medical diagnostic devices; this renders them more intricate than convolutional neural networks (CNNs). The challenge is compounded in histopathology image analysis, where the images are both limited in number and complex. In response, this study proposes TokenMixer, a hybrid architecture that combines the strengths of CNNs and ViTs. The architecture aims to improve feature extraction and classification accuracy with shorter training time and fewer parameters by minimizing the number of input patches employed during training: input patches are tokenized using convolutional layers, and encoder transformer layers process the tokenized patches across all network layers for fast and accurate breast cancer tumor subtype classification. The TokenMixer mechanism is inspired by the ConvMixer and TokenLearner models. First, the ConvMixer model dynamically generates spatial attention maps using convolutional layers, enabling the extraction of patches from input images to minimize the number of input patches used in training. Second, the TokenLearner model extracts relevant regions from the selected input patches, tokenizes them to improve feature extraction, and trains all tokenized patches in an encoder transformer network. We evaluated the TokenMixer model on the public BreakHis dataset, comparing it with ViT-based and other state-of-the-art methods. Our approach achieved strong results for both binary and multi-class classification of breast cancer subtypes across magnification levels (40×, 100×, 200×, 400×): accuracies of 97.02% for binary classification and 93.29% for multi-class classification, with decision times of 391.71 s and 1173.56 s, respectively. These results highlight the potential of our hybrid deep ViT-CNN architecture for advancing tumor classification in histopathological images. The source code is available at https://github.com/abimouloud/TokenMixer.
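The core idea the abstract describes, reducing the token count by attention-weighted spatial pooling in the style of TokenLearner, can be sketched in a few lines. The following is an illustrative NumPy simplification, not the authors' implementation (which uses convolutional layers and is available at the linked repository); the projection matrix `w`, the 14×14×64 feature map, and the choice of 8 tokens are assumptions chosen only for the example.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def token_learner(feature_map, w):
    """TokenLearner-style pooling: turn an (H, W, C) feature map into
    S learned tokens of dimension C, where S = w.shape[1].

    Each column of `w` produces one spatial attention map; the token is
    the attention-weighted average of the feature map under that map.
    """
    H, W, C = feature_map.shape
    flat = feature_map.reshape(H * W, C)   # (H*W, C)
    logits = flat @ w                      # (H*W, S) attention logits
    attn = softmax(logits, axis=0)         # softmax over spatial positions
    tokens = attn.T @ flat                 # (S, C) pooled tokens
    return tokens

# Toy shapes: a 14x14 feature map with 64 channels, pooled to 8 tokens.
rng = np.random.default_rng(0)
fm = rng.normal(size=(14, 14, 64))
w = rng.normal(size=(64, 8))
tokens = token_learner(fm, w)
print(tokens.shape)  # (8, 64)
```

Instead of the quadratic cost of attending over all 196 patch positions, the downstream transformer encoder only has to process the 8 pooled tokens, which is the source of the training-time and parameter savings the abstract claims.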


[Figures 1-17 and algorithm listings a-e are available in the PMC full text linked above.]

Similar Articles

1. Advancing breast cancer diagnosis: token vision transformers for faster and accurate classification of histopathology images.
   Vis Comput Ind Biomed Art. 2025 Jan 8;8(1):1. doi: 10.1186/s42492-024-00181-8.
2. HTC-retina: A hybrid retinal diseases classification model using transformer-Convolutional Neural Network from optical coherence tomography images.
   Comput Biol Med. 2024 Aug;178:108726. doi: 10.1016/j.compbiomed.2024.108726. Epub 2024 Jun 9.
3. Enhanced Pneumonia Detection in Chest X-Rays Using Hybrid Convolutional and Vision Transformer Networks.
   Curr Med Imaging. 2025;21:e15734056326685. doi: 10.2174/0115734056326685250101113959.
4. RT-ViT: Real-Time Monocular Depth Estimation Using Lightweight Vision Transformers.
   Sensors (Basel). 2022 May 19;22(10):3849. doi: 10.3390/s22103849.
5. From Binary to Multi-Class Classification: A Two-Step Hybrid CNN-ViT Model for Chest Disease Classification Based on X-Ray Images.
   Diagnostics (Basel). 2024 Dec 6;14(23):2754. doi: 10.3390/diagnostics14232754.
6. TAC-UNet: transformer-assisted convolutional neural network for medical image segmentation.
   Quant Imaging Med Surg. 2024 Dec 5;14(12):8824-8839. doi: 10.21037/qims-24-1229. Epub 2024 Nov 5.
7. Brain tumor segmentation and detection in MRI using convolutional neural networks and VGG16.
   Cancer Biomark. 2025 Mar;42(3):18758592241311184. doi: 10.1177/18758592241311184. Epub 2025 Apr 4.
8. MuSiC-ViT: A multi-task Siamese convolutional vision transformer for differentiating change from no-change in follow-up chest radiographs.
   Med Image Anal. 2023 Oct;89:102894. doi: 10.1016/j.media.2023.102894. Epub 2023 Jul 12.
9. Fusing global context with multiscale context for enhanced breast cancer classification.
   Sci Rep. 2024 Nov 9;14(1):27358. doi: 10.1038/s41598-024-78363-w.
10. Attention-Based Deep Learning Approach for Breast Cancer Histopathological Image Multi-Classification.
   Diagnostics (Basel). 2024 Jul 1;14(13):1402. doi: 10.3390/diagnostics14131402.

References Cited in This Article

1. CrossFormer++: A Versatile Vision Transformer Hinging on Cross-Scale Attention.
   IEEE Trans Pattern Anal Mach Intell. 2024 May;46(5):3123-3136. doi: 10.1109/TPAMI.2023.3341806. Epub 2024 Apr 3.
2. Convolution Neural Network for Breast Cancer Detection and Classification Using Deep Learning.
   Asian Pac J Cancer Prev. 2023 Feb 1;24(2):531-544. doi: 10.31557/APJCP.2023.24.2.531.
3. FabNet: A Features Agglomeration-Based Convolutional Neural Network for Multiscale Breast Cancer Histopathology Images Classification.
   Cancers (Basel). 2023 Feb 5;15(4):1013. doi: 10.3390/cancers15041013.
4. Classification of benign and malignant subtypes of breast cancer histopathology imaging using hybrid CNN-LSTM based transfer learning.
   BMC Med Imaging. 2023 Jan 30;23(1):19. doi: 10.1186/s12880-023-00964-0.
5. Vision-Transformer-Based Transfer Learning for Mammogram Classification.
   Diagnostics (Basel). 2023 Jan 4;13(2):178. doi: 10.3390/diagnostics13020178.
6. Deep Learning Based Methods for Breast Cancer Diagnosis: A Systematic Review and Future Direction.
   Diagnostics (Basel). 2023 Jan 3;13(1):161. doi: 10.3390/diagnostics13010161.
7. Breast cancer histopathological images classification based on deep semantic features and gray level co-occurrence matrix.
   PLoS One. 2022 May 5;17(5):e0267955. doi: 10.1371/journal.pone.0267955. eCollection 2022.
8. Breast cancer detection using artificial intelligence techniques: A systematic literature review.
   Artif Intell Med. 2022 May;127:102276. doi: 10.1016/j.artmed.2022.102276. Epub 2022 Mar 5.
9. Deep learning model for fully automated breast cancer detection system from thermograms.
   PLoS One. 2022 Jan 14;17(1):e0262349. doi: 10.1371/journal.pone.0262349. eCollection 2022.
10. Global patterns of breast cancer incidence and mortality: A population-based cancer registry data analysis from 2000 to 2020.
   Cancer Commun (Lond). 2021 Nov;41(11):1183-1194. doi: 10.1002/cac2.12207. Epub 2021 Aug 16.