Department of Radiology, The University of Chicago, 5841 South Maryland Avenue, MC2026, Chicago, Illinois 60637, USA.
Med Phys. 2009 Oct;36(10):4810-8. doi: 10.1118/1.3213517.
The purpose of this study was to investigate the effect of a noise injection method on the "overfitting" problem of artificial neural networks (ANNs) in two-class classification tasks. The authors compared ANNs trained with noise injection to ANNs trained with two other methods for avoiding overfitting: weight decay and early stopping. They also evaluated an automatic algorithm for selecting the magnitude of the noise injection. They performed simulation studies of an exclusive-or classification task with training datasets of 50, 100, and 200 cases (half normal and half abnormal) and an independent testing dataset of 2000 cases. They also compared the methods using a breast ultrasound dataset of 1126 cases. For simulated training datasets of 50 cases, the area under the receiver operating characteristic curve (AUC) was greater by 0.03 when training with noise injection than when training without any regularization, and this improvement was greater than those obtained with weight decay and early stopping (0.02 for both). For training datasets of 100 cases, noise injection and weight decay yielded similar increases in the AUC (0.02), whereas early stopping produced a smaller increase (0.01). For training datasets of 200 cases, the increases in the AUC were negligibly small for all methods (0.005). For the ultrasound dataset, ANNs trained with noise injection had a greater average AUC than ANNs trained without regularization and a slightly greater average AUC than ANNs trained with weight decay. These results indicate that training ANNs with noise injection can reduce overfitting to a greater degree than early stopping and to a degree similar to that of weight decay.
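The noise injection idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian form of the simulated exclusive-or data, the network size, the learning rate, the fixed noise standard deviation `noise_sd`, and the helper names `make_xor`, `train_ann`, and `auc`. The authors' automatic algorithm for selecting the noise magnitude is not described in the abstract and is not reproduced. The sketch only shows the general technique: fresh random noise is added to the inputs each time a training case is presented, and performance is measured as the AUC on an independent test set.

```python
import math
import random


def make_xor(n, rng, spread=0.5):
    """Draw n cases, half per class, around the four corners of an XOR layout.

    The Gaussian spread around each corner is an assumption, not the paper's setup.
    """
    cases = []
    for i in range(n):
        label = i % 2            # alternate labels so each class gets half the cases
        a = rng.randint(0, 1)    # choose one of the two corners belonging to this class
        b = a ^ label            # enforce label == a XOR b
        x = (a + rng.gauss(0, spread), b + rng.gauss(0, spread))
        cases.append((x, label))
    return cases


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_ann(cases, rng, noise_sd=0.0, hidden=4, epochs=200, lr=0.1):
    """Train a one-hidden-layer network by per-case gradient descent.

    With noise_sd > 0, zero-mean Gaussian noise is added to the inputs
    every time a case is presented (noise injection).
    """
    # Each hidden unit has two input weights and a bias.
    w1 = [[rng.gauss(0, 0.5) for _ in range(3)] for _ in range(hidden)]
    # Output unit: one weight per hidden unit, plus a bias at index `hidden`.
    w2 = [rng.gauss(0, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        h = [math.tanh(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w1]
        out = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + w2[hidden])
        return h, out

    for _ in range(epochs):
        for x, label in cases:
            # Noise injection: perturb the inputs with fresh noise on each presentation.
            xn = (x[0] + rng.gauss(0, noise_sd), x[1] + rng.gauss(0, noise_sd))
            h, p = forward(xn)
            g = p - label  # gradient of cross-entropy loss w.r.t. the output logit
            for j in range(hidden):
                gh = g * w2[j] * (1.0 - h[j] ** 2)  # backpropagate through tanh
                w2[j] -= lr * g * h[j]
                w1[j][0] -= lr * gh * xn[0]
                w1[j][1] -= lr * gh * xn[1]
                w1[j][2] -= lr * gh
            w2[hidden] -= lr * g

    return lambda x: forward(x)[1]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    rng = random.Random(0)
    train = make_xor(50, rng)     # small training set, as in the smallest simulated case
    test = make_xor(2000, rng)    # independent testing set
    test_labels = [l for _, l in test]
    for sd in (0.0, 0.3):         # 0.3 is an arbitrary illustrative noise level
        predict = train_ann(train, random.Random(1), noise_sd=sd)
        scores = [predict(x) for x, _ in test]
        print(f"noise_sd={sd}: test AUC = {auc(scores, test_labels):.3f}")
```

Running the script prints the test AUC with and without noise injection for a single random training set. A single run like this says little on its own; a comparison in the spirit of the study would average the AUC over many independently drawn training sets and would tune the noise level rather than fix it.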