Schlör Daniel, Ring Markus, Hotho Andreas
Data Science Chair, Institute of Computer Science, University of Wuerzburg, Würzburg, Germany.
Department of Electrical Engineering and Computer Science, University of Applied Sciences and Arts Coburg, Coburg, Germany.
Front Artif Intell. 2020 Sep 29;3:71. doi: 10.3389/frai.2020.00071. eCollection 2020.
Neural networks have to capture mathematical relationships in order to learn various tasks. They approximate these relations implicitly and therefore often do not generalize well. The recently proposed Neural Arithmetic Logic Unit (NALU) is a novel neural architecture that can explicitly represent mathematical relationships within the units of the network in order to learn operations such as addition, subtraction, or multiplication. Although NALUs have been shown to perform well on various downstream tasks, an in-depth analysis reveals practical shortcomings by design, such as the inability to multiply or divide negative input values, and training-stability issues for deeper networks. We address these issues and propose an improved model architecture. We evaluate our model empirically in various settings, from learning basic arithmetic operations to more complex functions. Our experiments indicate that our model solves the stability issues and outperforms the original NALU model in terms of arithmetic precision and convergence.
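To make the architecture the abstract refers to concrete, the following is a minimal forward-pass sketch of the original NALU cell as described by Trask et al. (2018), not of the improved model proposed in this paper. The class name, dimensions, and initialization are illustrative assumptions; the `np.abs` in the log-space path also shows why the original design cannot multiply or divide negative inputs (the sign is discarded), one of the shortcomings the abstract mentions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class NALUCell:
    """Minimal NALU cell (Trask et al., 2018), forward pass only; illustrative sketch.

    Additive path (Neural Accumulator, NAC): W = tanh(W_hat) * sigmoid(M_hat)
    biases each effective weight toward {-1, 0, 1}, so the unit computes signed
    sums. The multiplicative path runs the same weights in log-space, turning
    sums of logs into products. A learned gate g mixes the two paths.
    """

    def __init__(self, in_dim, out_dim, rng=None, eps=1e-7):
        rng = np.random.default_rng(rng)
        self.W_hat = rng.normal(size=(out_dim, in_dim))
        self.M_hat = rng.normal(size=(out_dim, in_dim))
        self.G = rng.normal(size=(out_dim, in_dim))  # gate weights
        self.eps = eps

    def forward(self, x):
        W = np.tanh(self.W_hat) * sigmoid(self.M_hat)  # NAC weights, near {-1, 0, 1}
        a = W @ x                                      # additive path: signed sum
        # Log-space path: |x| drops the sign, so products of negative inputs
        # come out wrong -- the design flaw discussed in the abstract.
        m = np.exp(W @ np.log(np.abs(x) + self.eps))
        g = sigmoid(self.G @ x)                        # learned gate between paths
        return g * a + (1.0 - g) * m
```

With hand-set saturated weights (W_hat, M_hat large positive so W is close to 1) and the gate pushed toward 0 or 1, the same cell computes either `x1 * x2` or `x1 + x2`, which is the explicit-representation property the abstract contrasts with implicit approximation.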