IEEE Trans Neural Netw Learn Syst. 2012 Feb;23(2):211-22. doi: 10.1109/TNNLS.2011.2178477.
Improving the fault tolerance of a neural network has been studied for more than two decades, and various training algorithms have been proposed. The on-line node fault injection-based algorithm is one of them: during training, hidden nodes randomly output zeros. While the idea is simple, theoretical analyses of this algorithm are far from complete. This paper presents its objective function and a convergence proof. We consider three cases for multilayer perceptrons (MLPs): (1) MLPs with a single linear output node; (2) MLPs with multiple linear output nodes; and (3) MLPs with a single sigmoid output node. For the convergence proof, we show that the algorithm converges with probability one. For the objective function, we show that the corresponding objective functions of cases (1) and (2) are of the same form: both consist of a mean squared error term, a regularizer term, and a weight decay term. For case (3), the objective function is slightly different from that of cases (1) and (2). With the derived objective functions, we can compare the similarities and differences among various algorithms and cases.
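As a concrete illustration of the training rule analyzed here, the following minimal NumPy sketch performs on-line node fault injection for case (1), an MLP with a single linear output node: at each on-line step, every hidden node's output is forced to zero with a small probability before the output and the gradients are computed. The network sizes, fault probability p_fault, learning rate, and toy target function are illustrative assumptions, not the paper's experimental setup.

    # Minimal sketch of on-line node fault injection training (case (1):
    # single linear output node). All names and constants are assumed for
    # illustration; they are not taken from the paper.
    import numpy as np

    rng = np.random.default_rng(0)

    n_in, n_hidden = 4, 8
    p_fault = 0.05   # probability that a hidden node outputs zero (assumed)
    lr = 0.01        # learning rate (assumed)

    W = rng.normal(scale=0.5, size=(n_hidden, n_in))  # input-to-hidden weights
    b = np.zeros(n_hidden)                            # hidden biases
    v = rng.normal(scale=0.5, size=n_hidden)          # hidden-to-output weights

    def sigmoid(u):
        return 1.0 / (1.0 + np.exp(-u))

    for step in range(10000):
        # one randomly drawn training sample per step (on-line learning)
        x = rng.uniform(-1.0, 1.0, size=n_in)
        y = np.sin(x).sum()                           # toy target function

        h = sigmoid(W @ x + b)                        # fault-free hidden outputs
        faults = rng.random(n_hidden) >= p_fault      # node survives with prob 1 - p_fault
        h_faulty = h * faults                         # injected faults: outputs forced to zero

        y_hat = v @ h_faulty                          # single linear output node
        err = y_hat - y

        # gradient descent on the squared error of the faulty network;
        # zeroed nodes contribute nothing to the update through h_faulty
        grad_v = err * h_faulty
        grad_pre = err * v * faults * h * (1.0 - h)   # backprop through the sigmoid
        v -= lr * grad_v
        W -= lr * np.outer(grad_pre, x)
        b -= lr * grad_pre

Averaged over the random fault masks, training of this kind is what the paper's analysis characterizes: the expected update descends an objective combining a mean squared error term with regularization and weight decay terms.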