IEEE Trans Cybern. 2022 Nov;52(11):11661-11671. doi: 10.1109/TCYB.2021.3078573. Epub 2022 Oct 17.
Although neural architecture search (NAS) can improve deep models, it typically neglects the valuable knowledge embedded in existing models. The high computational and time cost of NAS also means that we should not search from scratch, but should make every effort to reuse existing knowledge. In this article, we discuss what kind of knowledge in a model can and should be reused in a new architecture design. We then propose a new NAS algorithm, namely ModuleNet, which can fully inherit knowledge from existing convolutional neural networks. To make full use of existing models, we decompose them into different modules that retain their weights, and these modules together form a knowledge base. We then sample and search for a new architecture over this knowledge base. Unlike previous search algorithms, and benefiting from the inherited knowledge, our method can directly search for architectures in the macrospace with the NSGA-II algorithm without tuning the parameters in these modules. Experiments show that our strategy can efficiently evaluate the performance of a new architecture even without tuning the weights in its convolutional layers. With the help of the inherited knowledge, our search results consistently outperform the original architectures on various datasets (CIFAR10, CIFAR100, and ImageNet).
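To make the reuse idea concrete, below is a minimal sketch in PyTorch. It is not the authors' code: the choice of ResNet stages as "modules", the two torchvision backbones, the linear-probe head, the toy proxy score, and all function names (`frozen`, `assemble`, `proxy_score`) are assumptions made for illustration. It shows the three ingredients the abstract names: decomposing pretrained models into weight-preserving modules (the knowledge base), assembling a candidate architecture from them, and evaluating it without tuning any convolutional weights. Random sampling stands in for the NSGA-II loop.

```python
# Sketch of the knowledge-base idea, assuming (not from the paper) that
# "modules" are ResNet stages and that candidates swap stages between two
# pretrained backbones whose stage interfaces match channel-for-channel.
import random
import torch
import torch.nn as nn
from torchvision.models import resnet18, resnet34

def frozen(module: nn.Module) -> nn.Module:
    """Inherit pretrained weights but never tune them during search."""
    for p in module.parameters():
        p.requires_grad_(False)
    return module.eval()

# Knowledge base: weight-preserving modules decomposed from existing models.
r18 = resnet18(weights="IMAGENET1K_V1")
r34 = resnet34(weights="IMAGENET1K_V1")
knowledge_base = {
    "stem":   {"r18": frozen(nn.Sequential(r18.conv1, r18.bn1, r18.relu, r18.maxpool)),
               "r34": frozen(nn.Sequential(r34.conv1, r34.bn1, r34.relu, r34.maxpool))},
    "stage1": {"r18": frozen(r18.layer1), "r34": frozen(r34.layer1)},
    "stage2": {"r18": frozen(r18.layer2), "r34": frozen(r34.layer2)},
    "stage3": {"r18": frozen(r18.layer3), "r34": frozen(r34.layer3)},
    "stage4": {"r18": frozen(r18.layer4), "r34": frozen(r34.layer4)},
}
SLOTS = list(knowledge_base)

def assemble(genome: dict, num_classes: int = 10) -> nn.Module:
    """A candidate architecture: frozen modules plus a small trainable head."""
    body = nn.Sequential(*(knowledge_base[s][genome[s]] for s in SLOTS))
    head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                         nn.Linear(512, num_classes))  # the only tuned part
    return nn.Sequential(body, head)

def proxy_score(model: nn.Module, steps: int = 5) -> float:
    """Cheap fitness proxy: fit only the head on a toy batch, report loss.
    A stand-in for the paper's evaluation; a real search would use CIFAR/ImageNet."""
    x = torch.randn(8, 3, 224, 224)
    y = torch.randint(0, 10, (8,))
    opt = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

# Random sampling stands in for the NSGA-II search described in the abstract.
genome = {s: random.choice(["r18", "r34"]) for s in SLOTS}
print(genome, proxy_score(assemble(genome)))
```

Because every module keeps its pretrained weights frozen, scoring a candidate only requires fitting the tiny head, which is what makes macrospace search affordable without per-candidate retraining.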