

Towards Accurate and Compact Architectures via Neural Architecture Transformer.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):6501-6516. doi: 10.1109/TPAMI.2021.3086914. Epub 2022 Sep 14.

Abstract

Designing effective architectures is one of the key factors behind the success of deep neural networks. Existing deep architectures are either manually designed or automatically searched by some Neural Architecture Search (NAS) methods. However, even a well-designed/searched architecture may still contain many nonsignificant or redundant modules/operations (e.g., some intermediate convolution or pooling layers). Such redundancy may not only incur substantial memory consumption and computational cost but also deteriorate the performance. Thus, it is necessary to optimize the operations inside an architecture to improve the performance without introducing extra computational cost. To this end, we have proposed a Neural Architecture Transformer (NAT) method which casts the optimization problem into a Markov Decision Process (MDP) and seeks to replace the redundant operations with more efficient operations, such as skip or null connection. Note that NAT only considers a small number of possible replacements/transitions and thus comes with a limited search space. As a result, such a small search space may hamper the performance of architecture optimization. To address this issue, we propose a Neural Architecture Transformer++ (NAT++) method which further enlarges the set of candidate transitions to improve the performance of architecture optimization. Specifically, we present a two-level transition rule to obtain valid transitions, i.e., allowing operations to have more efficient types (e.g., convolution → separable convolution) or smaller kernel sizes (e.g., 5×5 → 3×3). Note that different operations may have different valid transitions. We further propose a Binary-Masked Softmax (BMSoftmax) layer to omit the possible invalid transitions. Last, based on the MDP formulation, we apply policy gradient to learn an optimal policy, which will be used to infer the optimized architectures. Extensive experiments show that the transformed architectures significantly outperform both their original counterparts and the architectures optimized by existing methods.
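The Binary-Masked Softmax (BMSoftmax) layer mentioned in the abstract can be illustrated with a small sketch. The code below is a hypothetical illustration, not the authors' implementation: it assumes each operation comes with a vector of transition logits and a 0/1 validity mask derived from the two-level transition rule, and it shows how invalid transitions can be excluded by assigning them exactly zero probability.

```python
# Minimal sketch of a binary-masked softmax over candidate transitions.
# Names, shapes, and the example candidates are assumptions for illustration only.
import numpy as np

def binary_masked_softmax(logits, valid_mask):
    """Softmax restricted to valid transitions.

    logits:     (num_candidates,) raw scores for each candidate transition.
    valid_mask: (num_candidates,) 1 for a valid transition, 0 for an invalid one.
    Returns a probability vector that assigns zero mass to invalid entries.
    """
    logits = np.asarray(logits, dtype=float)
    valid_mask = np.asarray(valid_mask, dtype=float)
    # Send invalid entries to -inf so exp() maps them to exactly 0.
    masked = np.where(valid_mask > 0, logits, -np.inf)
    masked -= masked.max()          # subtract the max for numerical stability
    exp = np.exp(masked)
    return exp / exp.sum()

# Hypothetical example: a 5x5 convolution whose valid transitions are
# {keep, 3x3 conv, 5x5 separable conv, skip, null}; growing to 7x7 is invalid.
probs = binary_masked_softmax(
    logits=[1.2, 0.7, 0.9, -0.3, -1.0, 0.4],
    valid_mask=[1, 1, 1, 1, 1, 0],
)
print(probs)  # the invalid last candidate receives probability 0
```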

