

LAP: Latency-aware automated pruning with dynamic-based filter selection

Authors

Chen Zailong, Liu Chubo, Yang Wangdong, Li Kenli, Li Keqin

Affiliations

College of Information Science and Engineering, Hunan University, Hunan 410082, China.

Department of Computer Science, State University of New York, New Paltz, NY 12561, USA.

Publication

Neural Netw. 2022 Aug;152:407-418. doi: 10.1016/j.neunet.2022.05.002. Epub 2022 May 10.

DOI: 10.1016/j.neunet.2022.05.002
PMID: 35609502
Abstract

Model pruning is widely used to compress and accelerate convolutional neural networks (CNNs). Conventional pruning techniques focus only on removing as many parameters as possible while preserving model accuracy. This work optimizes not only model accuracy but also model latency during pruning. When there are multiple optimization objectives, the difficulty of algorithm design increases exponentially, so this paper proposes latency sensitivity to effectively guide the determination of layer sparsity. We present the latency-aware automated pruning (LAP) framework, which leverages reinforcement learning to automatically determine layer sparsity. Latency sensitivity is used as prior knowledge and incorporated into the exploration loop. Rather than relying on a single reward signal such as validation accuracy or floating-point operations (FLOPs), our agent receives feedback on both the accuracy error and the latency sensitivity. We also provide a novel filter selection algorithm that accurately distinguishes important filters within a layer based on their dynamic changes. Compared with state-of-the-art compression policies, our framework demonstrates superior performance for VGGNet, ResNet, and MobileNet on CIFAR-10, ImageNet, and Food-101. LAP speeds up MobileNet-V1 inference by approximately 1.64x on a Titan RTX GPU with no loss of ImageNet Top-1 accuracy, and it significantly improves the Pareto curve of the accuracy-latency trade-off.
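The abstract's filter selection criterion ranks filters within a layer by their dynamic changes rather than by static magnitude. A minimal NumPy sketch of that idea (the exact scoring function is an assumption; here importance is taken as the L1 norm of each filter's weight delta between two training checkpoints, and the least dynamic filters are pruned):

```python
import numpy as np

def dynamic_filter_scores(w_prev, w_curr):
    """Score each output filter by how much it changed between two
    checkpoints: L1 norm of the per-filter weight delta."""
    # Conv weights: (out_channels, in_channels, kH, kW)
    delta = np.abs(w_curr - w_prev)
    return delta.reshape(delta.shape[0], -1).sum(axis=1)

def select_filters(w_prev, w_curr, sparsity):
    """Keep the most dynamic filters; prune a `sparsity` fraction."""
    scores = dynamic_filter_scores(w_prev, w_curr)
    n_prune = int(round(sparsity * len(scores)))
    keep = np.argsort(scores)[n_prune:]  # drop the lowest-scoring filters
    return np.sort(keep)

rng = np.random.default_rng(0)
w0 = rng.normal(size=(8, 3, 3, 3))                 # layer with 8 filters
w1 = w0 + rng.normal(scale=0.1, size=w0.shape)     # weights one epoch later
kept = select_filters(w0, w1, sparsity=0.25)
print(len(kept))  # 6 of 8 filters retained
```

This is only one plausible reading of "dynamic changes"; the paper's actual criterion may aggregate changes over many steps or normalize per layer.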

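Latency sensitivity serves as prior knowledge for choosing per-layer sparsity. A hedged sketch of that guidance (the allocation rule below is illustrative, not the paper's reinforcement-learning agent): each layer's sensitivity is estimated as the latency reduction per unit of sparsity, and layers whose latency falls steeply when pruned receive a larger share of the global sparsity budget.

```python
def latency_sensitivity(latency_fn, probe=0.3):
    """Approximate d(latency)/d(sparsity) by probing one pruned point."""
    return (latency_fn(0.0) - latency_fn(probe)) / probe

def allocate_sparsity(sensitivities, budget):
    """Split a global sparsity budget proportionally to sensitivity."""
    total = sum(sensitivities)
    return [budget * s / total for s in sensitivities]

# Toy per-layer latency models (milliseconds as a function of sparsity),
# with different slopes, e.g. compute-bound vs. memory-bound layers.
layers = [
    lambda s: 10.0 - 8.0 * s,  # highly latency-sensitive layer
    lambda s: 10.0 - 2.0 * s,  # weakly latency-sensitive layer
]
sens = [latency_sensitivity(f) for f in layers]
print([round(r, 3) for r in allocate_sparsity(sens, budget=0.5)])
# roughly [0.4, 0.1]: the sensitive layer gets 4x the pruning
```

In the paper this signal enters the agent's reward alongside accuracy error; the proportional split here is just the simplest way to show how the prior biases sparsity toward latency-critical layers.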

Similar Articles

1. LAP: Latency-aware automated pruning with dynamic-based filter selection.
Neural Netw. 2022 Aug;152:407-418. doi: 10.1016/j.neunet.2022.05.002. Epub 2022 May 10.
2. Dynamical Conventional Neural Network Channel Pruning by Genetic Wavelet Channel Search for Image Classification.
Front Comput Neurosci. 2021 Oct 27;15:760554. doi: 10.3389/fncom.2021.760554. eCollection 2021.
3. Weak sub-network pruning for strong and efficient neural networks.
Neural Netw. 2021 Dec;144:614-626. doi: 10.1016/j.neunet.2021.09.015. Epub 2021 Sep 30.
4. StructADMM: Achieving Ultrahigh Efficiency in Structured Pruning for DNNs.
IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):2259-2273. doi: 10.1109/TNNLS.2020.3045153. Epub 2022 May 2.
5. Pruning Networks With Cross-Layer Ranking & k-Reciprocal Nearest Filters.
IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):9139-9148. doi: 10.1109/TNNLS.2022.3156047. Epub 2023 Oct 27.
6. HRel: Filter pruning based on High Relevance between activation maps and class labels.
Neural Netw. 2022 Mar;147:186-197. doi: 10.1016/j.neunet.2021.12.017. Epub 2021 Dec 30.
7. Discrimination-Aware Network Pruning for Deep Model Compression.
IEEE Trans Pattern Anal Mach Intell. 2022 Aug;44(8):4035-4051. doi: 10.1109/TPAMI.2021.3066410. Epub 2022 Jul 1.
8. Asymptotic Soft Filter Pruning for Deep Convolutional Neural Networks.
IEEE Trans Cybern. 2020 Aug;50(8):3594-3604. doi: 10.1109/TCYB.2019.2933477. Epub 2019 Aug 27.
9. SAAF: Self-Adaptive Attention Factor-Based Taylor-Pruning on Convolutional Neural Networks.
IEEE Trans Neural Netw Learn Syst. 2025 May;36(5):8540-8553. doi: 10.1109/TNNLS.2024.3439435. Epub 2025 May 2.
10. Model pruning based on filter similarity for edge device deployment.
Front Neurorobot. 2023 Mar 2;17:1132679. doi: 10.3389/fnbot.2023.1132679. eCollection 2023.