IEEE Trans Neural Netw Learn Syst. 2020 Dec;31(12):5603-5612. doi: 10.1109/TNNLS.2020.2975051. Epub 2020 Nov 30.
We show that a neural network whose output is the difference of the outputs of two feedforward networks, each with exponential activation functions in the hidden layer and a logarithmic activation function in the output node, referred to as log-sum-exp (LSE) networks, is a smooth universal approximator of continuous functions over convex, compact sets. Through a logarithmic transform, this class of networks maps to a family of subtraction-free ratios of generalized posynomials (GPOS), which we also show to be universal approximators of positive functions over log-convex, compact subsets of the positive orthant. The main advantage of difference-LSE networks over classical feedforward neural networks is that, after a standard training phase, they provide surrogate models for design that possess a specific difference-of-convex-functions form, which makes them amenable to optimization via relatively efficient numerical methods. In particular, by adapting an existing difference-of-convex algorithm to these models, we obtain an algorithm for effective optimization-based design. We illustrate the proposed approach by applying it to the data-driven design of a diet for a patient with type-2 diabetes and to a nonconvex optimization problem.
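As a concrete illustration of the network class described above (a sketch, not the authors' code), a difference-LSE network can be written in a few lines. The weight matrices `A1`, `A2` and bias vectors `b1`, `b2` below are hypothetical placeholders standing in for trained parameters:

```python
import numpy as np

def lse_network(x, A, b):
    """One LSE network: f(x) = log(sum_k exp(b_k + A[k] @ x)).
    Smooth and convex in x; the hidden layer applies exponential
    activations, the single output node a logarithmic one."""
    z = b + A @ x                      # affine pre-activations, shape (K,)
    m = z.max()                        # max-shift for numerical stability
    return m + np.log(np.exp(z - m).sum())

def dlse_network(x, A1, b1, A2, b2):
    """Difference-LSE (DLSE) network: the difference of two LSE
    outputs, hence a difference-of-convex function of x."""
    return lse_network(x, A1, b1) - lse_network(x, A2, b2)

# hypothetical weights for a 2-input network with 3 and 2 hidden nodes
rng = np.random.default_rng(0)
A1, b1 = rng.normal(size=(3, 2)), rng.normal(size=3)
A2, b2 = rng.normal(size=(2, 2)), rng.normal(size=2)
print(dlse_network(np.array([1.0, -0.5]), A1, b1, A2, b2))
```

Per the universal-approximation result stated in the abstract, functions of this form can approximate any continuous function over a convex, compact set as the number of hidden nodes grows.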
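To make the logarithmic transform concrete: substituting x = log(y) into an LSE network and exponentiating yields a generalized posynomial in y, i.e. a sum of monomial terms exp(b_k) * prod_i y_i^(A_ki) with arbitrary real exponents, so a DLSE network maps to a subtraction-free ratio of two such posynomials. A minimal numerical check of this identity (the exponents and coefficients are illustrative, not from the paper):

```python
import numpy as np

def lse(x, A, b):
    """LSE network output log(sum_k exp(b_k + A[k] @ x))."""
    z = b + A @ x
    m = z.max()
    return m + np.log(np.exp(z - m).sum())

def gposynomial(y, A, b):
    """Generalized posynomial sum_k exp(b_k) * prod_i y_i**A[k, i],
    evaluated directly in the positive orthant."""
    return np.sum(np.exp(b) * np.prod(y ** A, axis=1))

y = np.array([2.0, 0.5])                       # point in the positive orthant
A = np.array([[1.0, -2.0], [0.5, 3.0]])        # real (possibly negative) exponents
b = np.array([0.1, -0.4])
# the log transform maps exp(LSE(log y)) to the posynomial value at y
lhs = np.exp(lse(np.log(y), A, b))
rhs = gposynomial(y, A, b)
print(np.isclose(lhs, rhs))
```

Because each exponentiated LSE output is a posynomial, the difference of two LSE outputs becomes, after exponentiation, a ratio of posynomials, which is the GPOS-ratio family the abstract refers to.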