Department of Information Engineering, University of Padua, viale Gradenigo 6, 35131 Padua, Italy.
DISI, Università di Bologna, Via dell'università 50, 47521 Cesena, Italy.
Sensors (Basel). 2020 Mar 14;20(6):1626. doi: 10.3390/s20061626.
In recent years, deep learning has achieved considerable success in pattern recognition, image segmentation, and many other classification tasks, with numerous studies and practical applications in image, video, and text classification. Activation functions play a crucial role in the discriminative capability of deep neural networks, and the design of new "static" or "dynamic" activation functions is an active area of research. The main difference between the two classes is that "static" activations treat all neurons and layers identically, whereas "dynamic" activations learn the parameters of the activation function independently for each layer or even each neuron. Although "dynamic" activation functions perform better in some applications, the increased number of trainable parameters requires more computation time and can lead to overfitting. In this work, we propose a mixture of "static" and "dynamic" activation functions that are stochastically selected at each layer. Our model design is based on modifying some layers within the functional blocks of the best-performing CNN models, with the aim of producing new models to be used as stand-alone networks or as components of an ensemble. Concretely, we replace each activation layer of a CNN (usually a ReLU layer) with an activation function stochastically drawn from a set of candidates: in this way, each resulting CNN has a different set of activation-function layers.
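The stochastic replacement described in the abstract can be sketched in a few lines. The snippet below is a minimal, framework-agnostic illustration (not the authors' implementation): a network layout is modeled as a list of layer names, the activation pool is a hypothetical mix of "static" and "dynamic" candidates, and every ReLU entry is swapped for one drawn uniformly at random, so repeated calls yield CNN variants with different activation layers, e.g. for use in an ensemble.

```python
import random

# Hypothetical pool mixing "static" (fixed-shape, e.g. leaky_relu, elu)
# and "dynamic" (learnable, e.g. prelu) activation functions.
ACTIVATION_POOL = ["relu", "leaky_relu", "elu", "prelu"]

def randomize_activations(layers, pool=ACTIVATION_POOL, seed=None):
    """Return a copy of `layers` where every 'relu' entry is replaced by
    an activation sampled uniformly from `pool`; other layers are kept."""
    rng = random.Random(seed)
    return [rng.choice(pool) if layer == "relu" else layer for layer in layers]

# Illustrative CNN layout (layer names are placeholders, not a real model).
cnn = ["conv1", "relu", "conv2", "relu", "pool", "fc", "relu"]
variant = randomize_activations(cnn, seed=0)
```

Each call with a different seed produces a distinct network in which only the activation layers differ, which is exactly the degree of freedom the proposed method exploits when building ensembles.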