MNGNAS: Distilling Adaptive Combination of Multiple Searched Networks for One-Shot Neural Architecture Search

Authors

Chen Zhihua, Qiu Guhao, Li Ping, Zhu Lei, Yang Xiaokang, Sheng Bin

Publication

IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):13489-13508. doi: 10.1109/TPAMI.2023.3293885. Epub 2023 Oct 3.

Abstract

Neural architecture search (NAS) has recently attracted great interest in both academia and industry. It remains a challenging problem owing to its huge search space and computational cost. Recent NAS studies have mainly focused on weight sharing, training a SuperNet once so that subnetworks inherit its weights. However, the branch corresponding to each subnetwork is not guaranteed to be fully trained, which not only incurs large computational costs but can also distort the architecture ranking during retraining. We propose a multi-teacher-guided NAS that introduces an adaptive ensemble and a perturbation-aware knowledge distillation algorithm into one-shot NAS. An optimization method that seeks the optimal descent direction is used to obtain adaptive coefficients for the feature maps of the combined teacher model. In addition, we propose a knowledge distillation process for the optimal and perturbed architectures in each search step, so that better feature maps are learned for later distillation. Comprehensive experiments verify that our approach is flexible and effective: it improves precision and search efficiency on standard recognition datasets, and it improves the correlation between the accuracy estimated by the search algorithm and the true accuracy on NAS benchmark datasets.
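The core idea of combining multiple teachers' feature maps with adaptive coefficients can be sketched as follows. This is a minimal illustrative toy, not the paper's exact algorithm: the softmax parameterization of the coefficients, the MSE distillation loss, and the finite-difference coefficient update are all our assumptions.

```python
import numpy as np

def adaptive_ensemble(teacher_feats, alphas):
    """Combine teacher feature maps with softmax-normalized coefficients."""
    w = np.exp(alphas - alphas.max())
    w = w / w.sum()
    combined = sum(wi * f for wi, f in zip(w, teacher_feats))
    return combined, w

def distill_loss(student_feat, teacher_feats, alphas):
    """Feature-map distillation loss against the adaptively combined teacher."""
    combined, _ = adaptive_ensemble(teacher_feats, alphas)
    return float(np.mean((student_feat - combined) ** 2))

# Toy setup: two teacher feature maps and one student feature map.
rng = np.random.default_rng(0)
t1, t2 = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
s = rng.standard_normal((4, 4))
alphas = np.zeros(2)  # start from equal weighting of the two teachers

# One descent step on the coefficients (finite differences for brevity;
# the paper instead derives an optimal descent direction analytically).
eps = 1e-4
grad = np.array([
    (distill_loss(s, [t1, t2], alphas + eps * np.eye(2)[i]) -
     distill_loss(s, [t1, t2], alphas - eps * np.eye(2)[i])) / (2 * eps)
    for i in range(2)
])
alphas = alphas - 0.5 * grad
```

In practice the coefficients would be updated jointly with the student during the search, so that the combined teacher tracks whichever searched networks currently provide the most useful supervision.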

