Multimodal Deep Learning for Stage Classification of Head and Neck Cancer Using Masked Autoencoders and Vision Transformers with Attention-Based Fusion.

Authors

Turki Anas, Alshabrawy Ossama, Woo Wai Lok

Affiliation

Department of Computer and Information Science, Faculty of Engineering and Environment, Northumbria University, Newcastle upon Tyne NE1 8ST, UK.

Publication

Cancers (Basel). 2025 Jun 24;17(13):2115. doi: 10.3390/cancers17132115.


DOI:10.3390/cancers17132115
PMID:40647415
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12248538/
Abstract

Head and neck squamous cell carcinoma (HNSCC) is a prevalent and aggressive cancer, and accurate staging using the AJCC system is essential for treatment planning. This study aims to enhance AJCC staging by integrating both clinical and imaging data using a multimodal deep learning pipeline. We propose a framework that employs a VGG16-based masked autoencoder (MAE) for self-supervised visual feature learning, enhanced by attention mechanisms (CBAM and BAM), and fuses image and clinical features using an attention-weighted fusion network. The models, benchmarked on the HNSCC and HN1 datasets, achieved approximately 80% accuracy (four classes) and ~66% accuracy (five classes), with notable AUC improvements, especially under BAM. The integration of clinical features significantly enhances stage-classification performance, setting a precedent for robust multimodal pipelines in radiomics-based oncology applications.
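The abstract describes fusing image and clinical features through an attention-weighted fusion network. As a minimal illustrative sketch (not the authors' implementation — the real pipeline learns the modality scores with a gating network on top of VGG16-MAE embeddings), the core idea of softmax-normalized modality weighting followed by concatenation can be expressed as:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scalars."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_weighted_fusion(img_feat, clin_feat, scores):
    """Fuse two modality feature vectors.

    scores: one scalar per modality; in the paper's pipeline these would
    be produced by a learned gating network, here they are fixed inputs
    for illustration only.
    """
    w_img, w_clin = softmax(scores)
    # Scale each modality by its attention weight, then concatenate
    # into a single fused representation for the classifier head.
    fused = [w_img * v for v in img_feat] + [w_clin * v for v in clin_feat]
    return fused, (w_img, w_clin)

# Toy example: a 2-d image embedding and a 1-d clinical feature,
# with the image modality given a higher (hypothetical) score.
fused, weights = attention_weighted_fusion([0.2, 0.8], [1.0], [2.0, 1.0])
```

The weights sum to 1, so the fusion interpolates the modalities' contributions rather than letting one dominate by scale alone; this is one common reading of "attention-weighted fusion" and is offered only as an assumption about the mechanism.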


Figures (g001–g007, from PMC12248538):

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b20d/12248538/d8c6377ffeb1/cancers-17-02115-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b20d/12248538/1309659698c8/cancers-17-02115-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b20d/12248538/51d126936c0c/cancers-17-02115-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b20d/12248538/6405b467dce9/cancers-17-02115-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b20d/12248538/aa84bbb2094b/cancers-17-02115-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b20d/12248538/a0593d4f6afc/cancers-17-02115-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b20d/12248538/f29a3d78efc4/cancers-17-02115-g007.jpg

Similar Articles

[1]
Multimodal Deep Learning for Stage Classification of Head and Neck Cancer Using Masked Autoencoders and Vision Transformers with Attention-Based Fusion.

Cancers (Basel). 2025-6-24

[2]
Trajectory-Ordered Objectives for Self-Supervised Representation Learning of Temporal Healthcare Data Using Transformers: Model Development and Evaluation Study.

JMIR Med Inform. 2025-6-4

[3]
CBAM VGG16: An efficient driver distraction classification using CBAM embedded VGG16 architecture.

Comput Biol Med. 2024-9

[4]
Enhancing Preoperative Diagnosis of Subscapular Muscle Injuries with Shoulder MRI-based Multimodal Radiomics.

Acad Radiol. 2025-2

[5]
Leveraging a foundation model zoo for cell similarity search in oncological microscopy across devices.

Front Oncol. 2025-6-18

[6]
Fine-Grained Classification of Pressure Ulcers and Incontinence-Associated Dermatitis Using Multimodal Deep Learning: Algorithm Development and Validation Study.

JMIR AI. 2025-5-1

[7]
Enhancing breast cancer detection on screening mammogram using self-supervised learning and a hybrid deep model of Swin Transformer and convolutional neural networks.

J Med Imaging (Bellingham). 2025-11

[8]
A fake news detection model using the integration of multimodal attention mechanism and residual convolutional network.

Sci Rep. 2025-7-1

[9]
Multimodal medical image-to-image translation via variational autoencoder latent space mapping.

Med Phys. 2025-7

[10]
A deep learning approach to direct immunofluorescence pattern recognition in autoimmune bullous diseases.

Br J Dermatol. 2024-7-16

