Arch-Net: Model conversion and quantization for architecture agnostic model deployment.

Authors

Fang Shuangkang, Xu Weixin, Feng Zipeng, Yuan Song, Wang Yufeng, Yang Yi, Ding Wenrui, Zhou Shuchang

Affiliations

School of Electrical and Information Engineering, Beihang University, Beijing, 100191, China.

Megvii Research, Megvii Inc., Beijing, 100096, China.

Publication

Neural Netw. 2025 Jul;187:107384. doi: 10.1016/j.neunet.2025.107384. Epub 2025 Mar 18.

DOI: 10.1016/j.neunet.2025.107384
PMID: 40120552
Abstract

The significant computational demands of Deep Neural Networks (DNNs) present a major challenge for their practical application. Recently, many Application-Specific Integrated Circuit (ASIC) chips have incorporated dedicated hardware support for neural network acceleration. However, the lengthy development cycle of ASIC chips means they often lag behind the latest advances in neural architecture research. For instance, Layer Normalization is not well-supported on many popular chips, and the efficiency of 7 × 7 convolution is significantly lower than that of an equivalent stack of three 3 × 3 convolutions. Therefore, in this paper, we introduce Arch-Net, a neural network framework composed exclusively of a select few common operators, namely 3 × 3 Convolution, 2 × 2 Max-pooling, Batch Normalization, Fully Connected layers, and Concatenation, which are efficiently supported across the majority of ASIC architectures. To facilitate the conversion of disparate network architectures into Arch-Net, we propose the Arch-Distillation methodology, which incorporates strategies such as Residual Feature Adaptation and the Teacher Attention Mechanism. These mechanisms enable effective conversion between different network structures alongside efficient model quantization. The resultant Arch-Net eliminates unconventional network constructs while maintaining robust performance even under sub-8-bit quantization, thereby enhancing compatibility and deployment efficiency. Empirical results from image classification and machine translation tasks demonstrate that using only a few types of operators in Arch-Net can achieve results comparable to those obtained with complex architectures. This offers new insight into deploying architecture-agnostic neural networks on various ASIC chips.
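A quick way to see the 7 × 7 versus three-3 × 3 equivalence mentioned in the abstract is to compose the kernels directly: the "full" convolution of three 3 × 3 kernels has a 7 × 7 support, so a stack of three 3 × 3 convolutions covers the same receptive field as a single 7 × 7 convolution while using fewer weights per channel pair (3 × 9 = 27 vs. 49). The sketch below is illustrative only (the all-ones kernel is an assumption, not a kernel from the paper):

```python
def conv_full(a, b):
    """'Full' 2-D convolution of two kernels given as lists of rows.

    The result has support (ha + hb - 1) x (wa + wb - 1), which is
    exactly the receptive-field arithmetic for stacked convolutions.
    """
    ha, wa = len(a), len(a[0])
    hb, wb = len(b), len(b[0])
    out = [[0.0] * (wa + wb - 1) for _ in range(ha + hb - 1)]
    for i in range(ha):
        for j in range(wa):
            for p in range(hb):
                for q in range(wb):
                    out[i + p][j + q] += a[i][j] * b[p][q]
    return out

ones3 = [[1.0] * 3 for _ in range(3)]            # illustrative 3x3 kernel
eff = conv_full(conv_full(ones3, ones3), ones3)  # effective kernel of 3 stacked 3x3 convs
print(len(eff), len(eff[0]))                     # 7 7 -> same support as one 7x7 conv
```

The same arithmetic in closed form is r = 1 + Σ(k − 1) = 1 + 3 · 2 = 7, which is why the abstract treats three 3 × 3 convolutions as the hardware-friendly replacement for one 7 × 7 convolution.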

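The abstract's "sub-8-bit quantization" refers to performance obtained via the Arch-Distillation training procedure, which is not reproduced here. Purely to fix the terminology, the following is a minimal sketch of symmetric uniform quantization at a sub-8-bit width; the weight values and the 4-bit setting are illustrative assumptions:

```python
def quantize(weights, bits):
    """Map floats to signed integer codes on a symmetric uniform grid."""
    qmax = 2 ** (bits - 1) - 1                   # e.g. 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax  # one scale per tensor
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the integer codes."""
    return [x * scale for x in q]

w = [0.31, -0.70, 0.05, 0.99, -0.12]   # illustrative weights
q, s = quantize(w, bits=4)             # 4-bit: codes confined to [-7, 7]
w_hat = dequantize(q, s)
print(q)                               # -> [2, -5, 0, 7, -1]
```

With this scheme the round-trip error per weight is bounded by half the grid step (scale / 2); the contribution of Arch-Distillation is keeping task accuracy high once such low-precision grids are imposed on every layer.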

Similar Articles

1. Arch-Net: Model conversion and quantization for architecture agnostic model deployment.
   Neural Netw. 2025 Jul;187:107384. doi: 10.1016/j.neunet.2025.107384. Epub 2025 Mar 18.
2. A memristive all-inclusive hypernetwork for parallel analog deployment of full search space architectures.
   Neural Netw. 2024 Jul;175:106312. doi: 10.1016/j.neunet.2024.106312. Epub 2024 Apr 15.
3. Quantization Friendly MobileNet (QF-MobileNet) Architecture for Vision Based Applications on Embedded Platforms.
   Neural Netw. 2021 Apr;136:28-39. doi: 10.1016/j.neunet.2020.12.022. Epub 2020 Dec 29.
4. SCA: Search-Based Computing Hardware Architecture with Precision Scalable and Computation Reconfigurable Scheme.
   Sensors (Basel). 2022 Nov 6;22(21):8545. doi: 10.3390/s22218545.
5. From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks.
   IEEE Trans Neural Netw Learn Syst. 2025 Apr;36(4):5837-5857. doi: 10.1109/TNNLS.2024.3394494. Epub 2025 Apr 4.
6. Optimization-Based Post-Training Quantization With Bit-Split and Stitching.
   IEEE Trans Pattern Anal Mach Intell. 2023 Feb;45(2):2119-2135. doi: 10.1109/TPAMI.2022.3159369. Epub 2023 Jan 6.
7. MADR-Net: multi-level attention dilated residual neural network for segmentation of medical images.
   Sci Rep. 2024 Jun 3;14(1):12699. doi: 10.1038/s41598-024-63538-2.
8. A Post-training Quantization Method for the Design of Fixed-Point-Based FPGA/ASIC Hardware Accelerators for LSTM/GRU Algorithms.
   Comput Intell Neurosci. 2022 May 11;2022:9485933. doi: 10.1155/2022/9485933. eCollection 2022.
9. ASD-Net: a novel U-Net based asymmetric spatial-channel convolution network for precise kidney and kidney tumor image segmentation.
   Med Biol Eng Comput. 2024 Jun;62(6):1673-1687. doi: 10.1007/s11517-024-03025-y. Epub 2024 Feb 8.
10. AresB-Net: accurate residual binarized neural networks using shortcut concatenation and shuffled grouped convolution.
    PeerJ Comput Sci. 2021 Mar 26;7:e454. doi: 10.7717/peerj-cs.454. eCollection 2021.