Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu City, 30010, Taiwan
Evol Comput. 2024 Jun 3;32(2):177-204. doi: 10.1162/evco_a_00331.
Evolution-based neural architecture search methods have shown promising results, but they demand substantial computational resources because each candidate architecture is trained from scratch before its fitness is evaluated, which leads to long search times. Covariance Matrix Adaptation Evolution Strategy (CMA-ES) has shown promising results in tuning hyperparameters of neural networks but has not been used for neural architecture search. In this work, we propose a framework called CMANAS, which applies the faster convergence property of CMA-ES to the deep neural architecture search problem. Instead of training each individual architecture separately, we use the accuracy of a trained one-shot model (OSM) on the validation data as a prediction of the architecture's fitness, which reduces the search time. We also use an architecture-fitness table (AF table) to keep a record of already evaluated architectures, which reduces the search time further. The architectures are modeled using a normal distribution, which is updated using CMA-ES based on the fitness of the sampled population. Experimentally, CMANAS achieves better results than previous evolution-based methods while significantly reducing the search time. The effectiveness of CMANAS is shown on two different search spaces using four datasets: CIFAR-10, CIFAR-100, ImageNet, and ImageNet16-120. All the results show that CMANAS is a viable alternative to previous evolution-based methods and extends the application of CMA-ES to the deep neural architecture search field.
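The search loop described in the abstract (sample architectures from a normal distribution, score each one, cache the scores, update the distribution) can be illustrated with a minimal sketch. This is not the CMANAS implementation. Several pieces are assumptions made only for illustration: the tiny search space (`NUM_EDGES`, `NUM_OPS`), the argmax decoding in `decode`, and `toy_fitness`, which stands in for the validation accuracy of a trained one-shot model. To stay dependency-free, the update is a simple elite-based re-estimate of a diagonal Gaussian (a cross-entropy-style rule), swapped in for the full CMA-ES covariance update.

```python
import random

NUM_EDGES = 4  # hypothetical: number of edges in a cell
NUM_OPS = 3    # hypothetical: candidate operations per edge


def decode(sample):
    """Assumed decoding: pick the operation with the largest value on each edge."""
    return tuple(max(range(NUM_OPS), key=lambda o: row[o]) for row in sample)


def toy_fitness(arch):
    """Stand-in for one-shot-model validation accuracy (not a real network)."""
    target = (2, 0, 1, 2)  # arbitrary "best" architecture for this toy problem
    return sum(a == t for a, t in zip(arch, target)) / NUM_EDGES


def search(generations=30, pop_size=12, elite=4, seed=0):
    rng = random.Random(seed)
    # Diagonal Gaussian over a NUM_EDGES x NUM_OPS real-valued encoding.
    mean = [[0.0] * NUM_OPS for _ in range(NUM_EDGES)]
    std = [[1.0] * NUM_OPS for _ in range(NUM_EDGES)]
    af_table = {}    # architecture-fitness table: architecture -> fitness
    evaluations = 0  # number of times the fitness function was actually called

    for _ in range(generations):
        population = []
        for _ in range(pop_size):
            sample = [[rng.gauss(mean[e][o], std[e][o]) for o in range(NUM_OPS)]
                      for e in range(NUM_EDGES)]
            arch = decode(sample)
            if arch not in af_table:  # evaluate only unseen architectures
                af_table[arch] = toy_fitness(arch)
                evaluations += 1
            population.append((af_table[arch], sample))

        # Simplified update: refit mean and std to the fittest samples.
        population.sort(key=lambda p: p[0], reverse=True)
        best = [s for _, s in population[:elite]]
        for e in range(NUM_EDGES):
            for o in range(NUM_OPS):
                vals = [s[e][o] for s in best]
                m = sum(vals) / elite
                var = sum((v - m) ** 2 for v in vals) / elite
                mean[e][o] = m
                std[e][o] = max(var ** 0.5, 0.05)  # floor keeps sampling alive

    best_arch = max(af_table, key=af_table.get)
    return best_arch, af_table[best_arch], evaluations, len(af_table)


if __name__ == "__main__":
    arch, fitness, evaluations, table_size = search()
    print("best architecture:", arch, "fitness:", fitness)
    print("fitness evaluations:", evaluations, "of", 30 * 12, "samples drawn")
```

Because repeated architectures are looked up in `af_table` rather than re-scored, the number of fitness evaluations is bounded by the number of distinct architectures (here 3^4 = 81) rather than by the number of samples drawn, which is the saving the AF table is meant to provide.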