Ying Hejie, Song Mengmeng, Tang Yaohong, Xiao Shungen, Xiao Zimin
Ningde Normal University, No. 1 College Road, Ningde, 352101, Fujian, China.
New Energy Vehicle Motor Industry Technology Development Base, Ningde Normal University, No. 1 College Road, Ningde, 352101, Fujian, China.
Sci Rep. 2024 Jul 2;14(1):15197. doi: 10.1038/s41598-024-65691-0.
Deep neural networks have achieved remarkable success in various fields, yet training an effective deep neural network remains challenging. This paper proposes a method for improving the training of deep neural networks, with the goal of improving their performance. First, based on the observation that the parameters (weights and biases) of a deep neural network change according to certain regularities during training, the potential of parameter prediction for improving training efficiency is identified. Second, it is shown that parameter prediction can also improve the performance of a deep neural network through the noise injected by prediction errors. Then, taking the relevant limitations into account, a Parameters Linear Prediction method for deep neural networks is developed. Finally, performance and hyperparameter sensitivity are validated on several representative backbones. Experimental results show that, compared with SGD, the proposed Parameters Linear Prediction method yields an increase of approximately 1% in the accuracy of the optimal model, along with a reduction of about 0.01 in top-1/top-5 error. The method also performs stably under various hyperparameter settings, which demonstrates its effectiveness and validates its capacity to enhance training efficiency and network performance.
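The abstract does not state the update rule used by the Parameters Linear Prediction method, so the following is only a minimal sketch of the general idea of predicting parameters linearly, not the authors' algorithm. It assumes two parameter snapshots taken at consecutive points in training and extrapolates each parameter along the straight line through them. The function name `linear_predict` and the `horizon` argument are hypothetical and introduced here purely for illustration.

```python
import numpy as np


def linear_predict(prev_params, curr_params, horizon=1.0):
    """Extrapolate every parameter array along the line through two snapshots.

    prev_params, curr_params: dicts mapping a parameter name to a NumPy array,
    taken at two consecutive points in training (an assumption of this sketch).
    horizon: how many snapshot intervals to look ahead (hypothetical knob).
    """
    return {
        name: curr_params[name] + horizon * (curr_params[name] - prev_params[name])
        for name in curr_params
    }


# Toy snapshots of a single layer's weights and bias.
prev = {"w": np.array([1.0, 2.0]), "b": np.array([0.5])}
curr = {"w": np.array([0.8, 2.1]), "b": np.array([0.4])}

# Predict the parameters two intervals ahead.
pred = linear_predict(prev, curr, horizon=2.0)
```

In a sketch like this, the predicted values would replace the current parameters before ordinary SGD training resumes. Any mismatch between the prediction and the parameters that SGD would actually have reached acts as a perturbation of the weights, which is the noise-injection effect the abstract refers to.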