Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer.

Authors

Xu Yang, Ma Yongjie

Affiliation

College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, 730070, China.

Publication

Sci Rep. 2023 Sep 22;13(1):15791. doi: 10.1038/s41598-023-42931-3.

Abstract

Deep convolutional neural networks (CNNs) have achieved promising performance in the field of deep learning, but manual design has become very difficult due to the increasingly complex topologies of CNNs. Recently, neural architecture search (NAS) methods have been proposed to automatically design network architectures that are superior to handcrafted counterparts. Unfortunately, most current NAS methods suffer either from the high computational complexity of the generated architectures or from limited flexibility in architecture design. To address these issues, this article proposes an evolutionary neural architecture search (ENAS) method based on an improved Transformer and a multi-branch ConvNet. The multi-branch block enriches the feature space and enhances the representational capacity of a network by combining paths of different complexities. Since convolution is inherently a local operation, a simple yet powerful "batch-free normalization Transformer Block" (BFNTBlock) is proposed to leverage both local information and long-range feature dependencies. In particular, mixing batch-free normalization (BFN) and batch normalization (BN) within the BFNTBlock blocks the accumulation of estimation shift caused by stacked BN layers, which benefits performance. The proposed method achieves remarkable accuracies of 97.24% and 80.06% on CIFAR10 and CIFAR100, respectively, with high computational efficiency, i.e., only 1.46 and 1.53 GPU days. To validate the universality of the method in application scenarios, the proposed algorithm is verified on two real-world applications, the GTSRB and NEU-CLS datasets, and achieves better performance than common methods.
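The abstract describes two building blocks but gives no code. The following is a minimal PyTorch sketch of the two ideas it names: a multi-branch conv block that sums paths of different complexities, and a Transformer-style block that alternates a batch-free normalization with batch normalization so BN layers are not stacked back to back. The module names (MultiBranchBlock, BFNTBlockSketch) and the use of LayerNorm as a stand-in for the paper's BFN are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch, not the released code of the paper.
import torch
import torch.nn as nn


class MultiBranchBlock(nn.Module):
    """Combine an identity path with conv paths of different complexities, then fuse by summation."""

    def __init__(self, channels: int):
        super().__init__()
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.branch1x1 = nn.Sequential(
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Identity + 3x3 + 1x1 paths enrich the feature space at little extra cost.
        return self.act(x + self.branch3x3(x) + self.branch1x1(x))


class BFNTBlockSketch(nn.Module):
    """Transformer-style block: attention under a batch-free norm (LayerNorm stand-in),
    then a feed-forward path normalized with BN, so BN layers are never stacked directly."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.bfn = nn.LayerNorm(dim)                       # batch-free normalization stand-in
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.bn = nn.BatchNorm1d(dim)                      # batch normalization on the MLP path
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):                                  # x: (batch, tokens, dim)
        h = self.bfn(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # long-range feature dependencies
        h = self.bn(x.transpose(1, 2)).transpose(1, 2)     # BatchNorm1d expects (batch, dim, tokens)
        return x + self.mlp(h)


if __name__ == "__main__":
    feat = torch.randn(2, 32, 16, 16)                      # (batch, channels, H, W)
    feat = MultiBranchBlock(32)(feat)
    tokens = feat.flatten(2).transpose(1, 2)               # flatten spatial dims into tokens
    out = BFNTBlockSketch(32)(tokens)
    print(out.shape)                                       # torch.Size([2, 256, 32])
```

The alternation of a batch-free norm before attention and BN before the MLP is one plausible reading of how the BFNTBlock avoids accumulating BN estimation shift; the paper itself should be consulted for the exact placement.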

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f656/10516961/276126c754e6/41598_2023_42931_Fig1_HTML.jpg
