Using Additive Noise in Back-Propagation Training
Holmström L, Koistinen P
Rolf Nevanlinna Institute, University of Helsinki
IEEE Trans Neural Netw. 1992;3(1):24-38. doi: 10.1109/72.105415.
The possibility of improving the generalization capability of a neural network by introducing additive noise to the training samples is discussed. The network considered is a feedforward layered neural network trained with the back-propagation algorithm. Back-propagation training is viewed as nonlinear least-squares regression, and the additive noise is interpreted as generating a kernel estimate of the probability density that describes the training vector distribution. Two specific application types are considered: pattern-classifier networks and estimation of a nonstochastic mapping from data corrupted by measurement errors. It is not proved that introducing additive noise to the training vectors always improves network generalization; however, the analysis suggests mathematically justified rules for choosing the characteristics of the noise when additive noise is used in training. Results from mathematical statistics are used to establish various asymptotic consistency results for the proposed method, and numerical simulations support the applicability of the training method.
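The kernel-estimate interpretation admits a compact statement. If each training vector x_i in R^d is perturbed by additive noise with density (1/sigma^d) K(u/sigma), then the perturbed samples are distributed according to the kernel density estimate

\[
\hat f_\sigma(x) \;=\; \frac{1}{n}\sum_{i=1}^{n} \frac{1}{\sigma^{d}}\, K\!\left(\frac{x - x_i}{\sigma}\right),
\]

where n is the number of training vectors, K is the unit-scale noise density, and the noise level sigma plays the role of the smoothing bandwidth. This is a standard identity, stated here as a sketch of the interpretation named in the abstract rather than a formula quoted from the paper.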
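As a concrete illustration (a minimal sketch, not the authors' code; the toy regression task, network size, learning rate, and noise level sigma are all assumptions), the following trains a one-hidden-layer feedforward network by back-propagation on a least-squares loss, drawing fresh additive Gaussian input noise on every pass. A Gaussian noise density corresponds to a Gaussian kernel with bandwidth sigma in the estimate above.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: a nonstochastic mapping y = sin(2*pi*x) observed with
# measurement error, mirroring the abstract's second application type.
n = 40
x = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal((n, 1))

# One-hidden-layer network with tanh units, trained by plain gradient
# descent on squared error (back-propagation as nonlinear least-squares
# regression).
h = 20
W1 = 0.5 * rng.standard_normal((1, h))
b1 = np.zeros(h)
W2 = 0.5 * rng.standard_normal((h, 1))
b2 = np.zeros(1)

sigma = 0.05   # noise standard deviation ~ kernel bandwidth (assumed value)
lr = 0.1

for epoch in range(5000):
    # Fresh additive noise each pass: the network effectively sees samples
    # drawn from a Gaussian-kernel-smoothed version of the empirical
    # input distribution.
    x_noisy = x + sigma * rng.standard_normal(x.shape)

    # Forward pass.
    a1 = np.tanh(x_noisy @ W1 + b1)
    out = a1 @ W2 + b2

    # Backward pass for the least-squares loss (1/n) * sum (out - y)^2.
    d_out = 2.0 * (out - y) / n
    dW2 = a1.T @ d_out
    db2 = d_out.sum(axis=0)
    d_a1 = d_out @ W2.T * (1.0 - a1 ** 2)
    dW1 = x_noisy.T @ d_a1
    db1 = d_a1.sum(axis=0)

    # Gradient-descent update.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# Evaluate on noise-free inputs to gauge generalization to the true mapping.
x_test = np.linspace(-1, 1, 200).reshape(-1, 1)
pred = np.tanh(x_test @ W1 + b1) @ W2 + b2
mse = np.mean((pred - np.sin(2 * np.pi * x_test)) ** 2)
print(f"test MSE vs. true mapping: {mse:.4f}")

In this sketch sigma is fixed by hand; the "mathematically justified rules" mentioned in the abstract concern precisely how such noise characteristics should be chosen.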