IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):9225-9232. doi: 10.1109/TPAMI.2023.3235369. Epub 2023 Jun 5.
Batch normalization (BN) is a fundamental unit in modern deep neural networks. However, BN and its variants focus on the normalization statistics while neglecting the recovery step, which uses a linear transformation to improve the capacity to fit complex data distributions. In this paper, we demonstrate that the recovery step can be improved by aggregating the neighborhood of each neuron rather than considering a single neuron alone. Specifically, we propose a simple yet effective method named batch normalization with enhanced linear transformation (BNET) to embed spatial contextual information and improve representation ability. BNET can be easily implemented using depth-wise convolution and seamlessly transplanted into existing architectures that use BN. To the best of our knowledge, BNET is the first attempt to enhance the recovery step of BN. Furthermore, BN is interpreted as a special case of BNET from both spatial and spectral views. Experimental results demonstrate that BNET achieves consistent performance gains over various backbones in a wide range of visual tasks. Moreover, BNET can accelerate the convergence of network training and enhance spatial information by assigning larger weights to important neurons.
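The core idea above can be illustrated compactly: BN normalizes each channel and then applies a per-neuron affine recovery (gamma * x_hat + beta), whereas BNET replaces that affine step with a depth-wise convolution that aggregates each neuron's spatial neighborhood. The following is a minimal pure-Python sketch of this idea in 1-D, not the paper's implementation; the function names, the 3-tap kernel, and the zero-padding choice are illustrative assumptions. Note how a kernel of size 1 recovers ordinary BN, matching the paper's claim that BN is a special case of BNET.

```python
import math

def batch_norm(x, eps=1e-5):
    # Normalization step: zero-mean, unit-variance over the given values
    # (one channel, flattened over batch/spatial positions).
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def enhanced_recovery(xhat, weights, bias):
    # Recovery step as a depth-wise k-tap convolution over spatial
    # neighbors (zero padding at the borders). With k = 1 this reduces
    # to BN's usual per-neuron affine transform: weights[0]*xhat + bias.
    k = len(weights)
    r = k // 2
    out = []
    for i in range(len(xhat)):
        acc = bias
        for j, w in enumerate(weights):
            idx = i + j - r
            if 0 <= idx < len(xhat):
                acc += w * xhat[idx]
        out.append(acc)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0]
xhat = batch_norm(x)
bn_like = enhanced_recovery(xhat, [0.5], 0.1)            # k = 1: plain BN affine
bnet_like = enhanced_recovery(xhat, [0.2, 0.5, 0.2], 0.1)  # k = 3: neighborhood aggregation
```

In practice (e.g. for 2-D feature maps in a CNN), the same effect would be obtained with a k x k depth-wise convolution applied after the normalization statistics, which is why BNET drops into BN-based architectures with little change.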