Apicella Andrea, Donnarumma Francesco, Isgrò Francesco, Prevete Roberto
Dipartimento di Ingegneria Elettrica e delle Tecnologie dell'Informazione, Università di Napoli Federico II, Italy.
Institute of Cognitive Sciences and Technologies (ISTC), National Research Council (CNR), Via San Martino della Battaglia 44, 00185 Rome, Italy.
Neural Netw. 2021 Jun;138:14-32. doi: 10.1016/j.neunet.2021.01.026. Epub 2021 Feb 9.
In the neural network literature, there is strong interest in identifying and defining activation functions which can improve neural network performance. In recent years there has been renewed interest in the scientific community in investigating activation functions which can be trained during the learning process, usually referred to as trainable, learnable or adaptable activation functions. They appear to lead to better network performance. Diverse and heterogeneous models of trainable activation functions have been proposed in the literature. In this paper, we present a survey of these models. Starting from a discussion on the use of the term "activation function" in the literature, we propose a taxonomy of trainable activation functions, highlight common and distinctive properties of recent and past models, and discuss the main advantages and limitations of this type of approach. We show that many of the proposed approaches are equivalent to adding neuron layers which use fixed (non-trainable) activation functions, together with some simple local rule that constrains the corresponding weight layers.
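To make the closing claim concrete, consider PReLU (He et al., 2015), a well-known trainable activation function: PReLU(x) = max(0, x) + a·min(0, x) with a trainable slope a. The following is a minimal PyTorch sketch (the class and parameter names are ours for illustration, not from the paper) showing how this "trainable activation" can be rewritten as two fixed activations, relu(x) and -relu(-x), mixed by a small trainable weight layer in which the positive branch is constrained to weight 1; this is the kind of fixed-activation-plus-local-rule equivalence the abstract describes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PReLUAsFixedActivations(nn.Module):
    """PReLU(x) = max(0, x) + a * min(0, x), rewritten as a trainable
    linear combination of two *fixed* activation functions:
        PReLU(x) = 1 * relu(x) + a * (-relu(-x))
    The only trainable part is the mixing weight `a`; the constraint
    that the relu(x) branch always has weight 1 plays the role of the
    "simple local rule" on the extra weight layer."""

    def __init__(self, init_a: float = 0.25):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(init_a))  # trainable slope

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pos = F.relu(x)    # fixed activation 1: max(0, x)
        neg = -F.relu(-x)  # fixed activation 2: min(0, x)
        return pos + self.a * neg

# Sanity check against PyTorch's built-in trainable activation:
x = torch.randn(4, 3)
assert torch.allclose(PReLUAsFixedActivations()(x), nn.PReLU(init=0.25)(x))
```

The same decomposition pattern extends to richer trainable activations (e.g., APL or Maxout-style units), which mix a larger basis of fixed functions with trainable coefficients; the survey's taxonomy makes this reading systematic.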