
Lightweight transformer image feature extraction network.

Authors

Zheng Wenfeng, Lu Siyu, Yang Youshuai, Yin Zhengtong, Yin Lirong

Affiliations

School of Automation, University of Electronic Science and Technology of China, Chengdu, Sichuan, China.

College of Resource and Environment Engineering, Guizhou University, Guiyang, Guizhou, China.

Publication

PeerJ Comput Sci. 2024 Jan 31;10:e1755. doi: 10.7717/peerj-cs.1755. eCollection 2024.

Abstract

In recent years, image feature extraction based on the Transformer has become a research hotspot. However, when a Transformer is used for image feature extraction, the model's complexity grows quadratically with the number of input tokens. This quadratic complexity prevents vision-transformer-based backbone networks from modelling high-resolution images and is computationally expensive. To address this issue, this study proposes two approaches to speed up Transformer models. First, the self-attention mechanism's quadratic complexity is reduced to linear, improving the model's internal processing speed. Second, a parameter-free lightweight pruning method is introduced, which adaptively samples input images to filter out unimportant tokens, effectively reducing irrelevant input. Finally, the two methods are combined into an efficient attention mechanism. Experimental results show that each method alone reduces the computation of the original Transformer model by 30%-50%, while the combined efficient attention mechanism achieves a 60%-70% reduction.
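The two ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the linear attention below uses the well-known elu(x)+1 kernel feature map, and the "parameter-free pruning" is approximated by a token-norm top-k heuristic, since the abstract does not specify the paper's adaptive sampling criterion.

```python
import numpy as np

def linear_attention(Q, K, V):
    """Kernelized attention in O(n*d^2) instead of O(n^2*d).

    Uses the feature map phi(x) = elu(x) + 1; the paper summarized
    above may linearize self-attention differently.
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # strictly positive
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                    # (d, d_v): one summary of all keys/values
    Z = Qp @ Kp.sum(axis=0)          # (n,): per-query normalizer
    return (Qp @ KV) / Z[:, None]

def prune_tokens(X, keep_ratio=0.5):
    """Parameter-free token pruning (illustrative heuristic): keep the
    tokens with the largest L2 norm, preserving their original order."""
    k = max(1, int(len(X) * keep_ratio))
    idx = np.argsort(-np.linalg.norm(X, axis=1))[:k]
    return X[np.sort(idx)]

rng = np.random.default_rng(0)
n, d = 128, 16
X = rng.normal(size=(n, d))          # n tokens of dimension d
X = prune_tokens(X, keep_ratio=0.5)  # 128 -> 64 tokens before attention
out = linear_attention(X, X, X)      # self-attention on the kept tokens
print(out.shape)                     # (64, 16)
```

Computing `Kp.T @ V` before multiplying by `Qp` is what removes the quadratic term: the n-by-n attention matrix is never materialized, so cost scales linearly in the token count, and pruning shrinks that count further before attention runs.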

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9215/11636678/96cbc79b893a/peerj-cs-10-1755-g001.jpg
