PWLU: Learning Specialized Activation Functions With the Piecewise Linear Unit.

Author Information

Zhu Zezhou, Zhou Yucong, Dong Yuan, Zhong Zhao

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):12269-12286. doi: 10.1109/TPAMI.2023.3286109. Epub 2023 Sep 5.

Abstract

The choice of activation function is crucial to deep neural networks. ReLU is a popular hand-designed activation function, while Swish, an automatically searched one, outperforms ReLU on many challenging datasets. However, the search method has two main drawbacks. First, its tree-based search space is highly discrete and restricted, making it difficult to search. Second, the sample-based search method is inefficient at finding specialized activation functions for each dataset or neural architecture. To overcome these drawbacks, we propose a new activation function called the Piecewise Linear Unit (PWLU), incorporating a carefully designed formulation and learning method. PWLU can learn specialized activation functions for different models, layers, or channels. We also propose a non-uniform version of PWLU, which maintains sufficient flexibility while requiring fewer intervals and parameters. Additionally, we generalize PWLU to three-dimensional space to define a piecewise linear surface named 2D-PWLU, which can be treated as a non-linear binary operator. Experimental results show that PWLU achieves state-of-the-art performance across various tasks and models, and that 2D-PWLU outperforms element-wise addition when aggregating features from different branches. The proposed PWLU and its variants are easy to implement and efficient at inference, so they can be widely applied in real-world applications.
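
To make the formulation concrete, below is a minimal sketch of a uniform piecewise linear unit in PyTorch. It assumes a fixed input range split into equal intervals, learnable function values at the breakpoints, and learnable slopes outside the range; the class name, parameter names, and initialization choices are illustrative, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class UniformPWLU(nn.Module):
    """Sketch of a uniform piecewise linear unit (illustrative only):
    the range [left, right] is split into n equal intervals, the value
    at each of the n + 1 breakpoints is a learnable parameter, and the
    function extends linearly beyond the range with learnable slopes."""

    def __init__(self, n_intervals: int = 16, left: float = -3.0, right: float = 3.0):
        super().__init__()
        self.n = n_intervals
        self.left = left
        self.right = right
        # Initialize breakpoint values to ReLU so training starts from a
        # well-understood activation and learns a specialization from there.
        xs = torch.linspace(left, right, n_intervals + 1)
        self.values = nn.Parameter(torch.relu(xs))
        self.slope_left = nn.Parameter(torch.zeros(()))   # slope for x < left
        self.slope_right = nn.Parameter(torch.ones(()))   # slope for x > right

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        width = (self.right - self.left) / self.n
        # Interval index of each input, clamped so out-of-range inputs
        # temporarily map into the first/last interval.
        idx = torch.clamp(((x - self.left) / width).floor(), 0, self.n - 1).long()
        x0 = self.left + idx.to(x.dtype) * width                   # interval's left breakpoint
        slope = (self.values[idx + 1] - self.values[idx]) / width  # slope inside the interval
        y = self.values[idx] + slope * (x - x0)
        # Linear extrapolation outside [left, right].
        y = torch.where(x < self.left, self.values[0] + self.slope_left * (x - self.left), y)
        y = torch.where(x > self.right, self.values[-1] + self.slope_right * (x - self.right), y)
        return y


# Usage: drop-in replacement for ReLU/Swish in any model.
act = UniformPWLU()
print(act(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```

The per-layer or per-channel specialization described in the abstract would amount to instantiating an independent set of breakpoint parameters per layer or per channel; the non-uniform variant would additionally learn where the breakpoints sit rather than fixing equal widths.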
