Surrogate-Assisted Hybrid-Model Estimation of Distribution Algorithm for Mixed-Variable Hyperparameters Optimization in Convolutional Neural Networks.

Authors

Li Jian-Yu, Zhan Zhi-Hui, Xu Jin, Kwong Sam, Zhang Jun

Publication information

IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2338-2352. doi: 10.1109/TNNLS.2021.3106399. Epub 2023 May 2.

Abstract

The performance of a convolutional neural network (CNN) depends heavily on its hyperparameters. However, finding a suitable hyperparameter configuration is difficult and computationally expensive because of three issues: 1) the mixed-variable problem posed by different types of hyperparameters; 2) the large-scale search space of candidate configurations; and 3) the high computational cost of evaluating each candidate configuration. This article addresses these three issues and proposes a novel estimation of distribution algorithm (EDA) for efficient hyperparameter optimization, with three major contributions in the algorithm design. First, a hybrid-model EDA is proposed to deal with the mixed-variable difficulty. The algorithm uses a mixed-variable encoding scheme to encode the hyperparameters and adopts an adaptive hybrid-model learning (AHL) strategy to optimize the mixed variables efficiently. Second, an orthogonal initialization (OI) strategy is proposed to deal with the large-scale search space. Third, a surrogate-assisted multi-level evaluation (SME) method is proposed to reduce the computational cost. The resulting algorithm is named surrogate-assisted hybrid-model EDA (SHEDA). In the experimental studies, SHEDA is verified on widely used classification benchmark problems and compared with various state-of-the-art methods. In addition, a case study on aortic dissection (AD) diagnosis is carried out to evaluate its performance. The experimental results show that SHEDA is effective and efficient for hyperparameter optimization: it finds a satisfactory hyperparameter configuration for CIFAR10, CIFAR100, and AD diagnosis with only 0.58, 0.97, and 1.18 GPU days, respectively.
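To make the idea of a hybrid-model EDA over mixed variables more concrete, the following is a minimal sketch, not the SHEDA algorithm itself. The abstract does not specify the models, so everything here is an assumption: a Gaussian model for one continuous hyperparameter (a log learning rate), a smoothed frequency model for one categorical hyperparameter (an activation function), and a cheap `toy_fitness` function standing in for CNN training. The AHL, OI, and SME components are not reproduced, and all names are hypothetical.

```python
import random
import statistics

# Hypothetical search space (not taken from the paper): one continuous and
# one categorical hyperparameter.
CATEGORIES = ["relu", "tanh", "elu"]


def toy_fitness(log_lr, activation):
    """Stand-in for validation accuracy; a real run would train a CNN."""
    bonus = {"relu": 0.0, "tanh": -0.5, "elu": -0.2}[activation]
    return -(log_lr + 3.0) ** 2 + bonus  # maximum at log_lr = -3 with "relu"


def run_eda(generations=30, pop_size=40, elite_frac=0.25, seed=0):
    rng = random.Random(seed)
    mean, std = -2.5, 1.5  # Gaussian model for the continuous variable
    probs = {c: 1.0 / len(CATEGORIES) for c in CATEGORIES}  # categorical model
    best = None
    for _ in range(generations):
        # Sample a population from the two probability models.
        pop = []
        for _ in range(pop_size):
            x = min(-1.0, max(-5.0, rng.gauss(mean, std)))  # clip to bounds
            weights = [probs[c] for c in CATEGORIES]
            a = rng.choices(CATEGORIES, weights=weights)[0]
            pop.append((toy_fitness(x, a), x, a))
        # Keep the best individuals and remember the best one seen so far.
        pop.sort(key=lambda t: t[0], reverse=True)
        elite = pop[: max(2, int(elite_frac * pop_size))]
        if best is None or elite[0][0] > best[0]:
            best = elite[0]
        # Re-estimate both models from the elite set.
        xs = [e[1] for e in elite]
        mean = statistics.fmean(xs)
        std = max(statistics.stdev(xs), 1e-3)  # floor avoids collapse to zero
        for c in CATEGORIES:
            count = sum(1 for e in elite if e[2] == c)
            # Laplace smoothing keeps every category reachable.
            probs[c] = (count + 1) / (len(elite) + len(CATEGORIES))
    return best


best_fitness, best_log_lr, best_activation = run_eda()
```

Each variable type gets a probability model suited to it, and both models are re-estimated from the same elite set each generation. In a real hyperparameter search, the `toy_fitness` call would be the expensive step, which is the cost the abstract says the surrogate-assisted evaluation is meant to reduce.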
