IEEE Trans Neural Netw Learn Syst. 2016 May;27(5):978-92. doi: 10.1109/TNNLS.2015.2431251. Epub 2015 Jun 2.
The training of a multilayer perceptron neural network (MLPNN) concerns the selection of its architecture and connection weights via the minimization of both the training error and a penalty term. Different penalty terms have been proposed to control the smoothness of the MLPNN for better generalization capability. However, controlling smoothness using, for instance, the norm of weights or the Vapnik-Chervonenkis dimension cannot distinguish individual MLPNNs with the same number of free parameters or the same norm. In this paper, to enhance generalization capability, we propose a stochastic sensitivity measure (ST-SM) to realize a new penalty term for MLPNN training. For a given MLPNN, the ST-SM determines the expectation of the squared output differences between the training samples and unseen samples located within their Q-neighborhoods. It provides a direct measurement of the MLPNN's output fluctuations, i.e., smoothness. We adopt a two-phase Pareto-based multiobjective training algorithm that minimizes the training error and the ST-SM as biobjective functions. Experiments on 20 UCI data sets show that MLPNNs trained by the proposed algorithm yield better accuracies on testing data than several recent and classical MLPNN training methods.
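The ST-SM described above can be estimated by Monte Carlo sampling: perturb each training sample within its Q-neighborhood and average the squared output differences. The sketch below is an illustrative approximation only, not the paper's implementation; the network weights, the uniform perturbation distribution over [-q, q]^d, and the function names are all assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained MLPNN: one hidden layer with fixed random weights.
def mlp_forward(X, W1, b1, W2, b2):
    h = np.tanh(X @ W1 + b1)   # hidden-layer activations
    return h @ W2 + b2         # linear output

# Monte Carlo estimate of the stochastic sensitivity: the expected squared
# output difference between each training sample x and perturbed samples
# x + dx, with dx drawn uniformly from the Q-neighborhood [-q, q]^d
# (an assumed neighborhood model for illustration).
def stochastic_sensitivity(X, q, n_perturb, W1, b1, W2, b2):
    y = mlp_forward(X, W1, b1, W2, b2)
    total = 0.0
    for _ in range(n_perturb):
        dX = rng.uniform(-q, q, size=X.shape)
        y_perturbed = mlp_forward(X + dX, W1, b1, W2, b2)
        total += np.mean((y_perturbed - y) ** 2)
    return total / n_perturb

# Toy setup: 50 samples in 4 dimensions, 8 hidden units.
d, h_dim = 4, 8
W1 = rng.normal(size=(d, h_dim)); b1 = rng.normal(size=h_dim)
W2 = rng.normal(size=(h_dim, 1)); b2 = rng.normal(size=1)
X = rng.normal(size=(50, d))

s_small = stochastic_sensitivity(X, q=0.01, n_perturb=100, W1=W1, b1=b1, W2=W2, b2=b2)
s_large = stochastic_sensitivity(X, q=0.5, n_perturb=100, W1=W1, b1=b1, W2=W2, b2=b2)
```

A smoother network yields a smaller estimate, and the estimate grows with the neighborhood size q, which is why minimizing this quantity alongside the training error penalizes output fluctuations around the training data.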