IEEE Trans Pattern Anal Mach Intell. 2020 Nov;42(11):2912-2925. doi: 10.1109/TPAMI.2019.2917685. Epub 2019 May 20.
Gradient descent optimization has become the standard paradigm for training deep convolutional neural networks (DCNNs). However, alternative learning strategies within the DCNN training process have rarely been explored by the deep learning (DL) community. This motivates the introduction of a non-iterative learning strategy that retrains the neurons at the top dense or fully connected (FC) layers of a DCNN, resulting in higher performance. The proposed method exploits the Moore-Penrose inverse to pull back the current residual error to each FC layer, generating well-generalized features. The weights of each FC layer are then recomputed via the Moore-Penrose inverse. We evaluate the proposed approach on six widely used object recognition benchmark datasets: Scene-15, CIFAR-10, CIFAR-100, SUN-397, Places365, and ImageNet. The experimental results show that the proposed method improves on 30 state-of-the-art methods. Interestingly, they also indicate that any DCNN combined with the proposed method can outperform the same network trained with its original backpropagation (BP)-based procedure.
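To make the core idea concrete, the following is a minimal sketch (not the authors' exact algorithm) of recomputing one FC layer's weights in closed form with the Moore-Penrose pseudoinverse rather than iterative BP; the function name, the ridge regularizer, and the toy data are illustrative assumptions.

```python
import numpy as np

def recompute_fc_weights(H: np.ndarray, T: np.ndarray, ridge: float = 1e-3) -> np.ndarray:
    """Closed-form least-squares weights for one FC layer.

    H : (n_samples, n_features) layer-input activations
    T : (n_samples, n_outputs)  desired outputs, e.g. one-hot labels or
        the residual error pulled back to this layer
    ridge : small regularizer for numerical stability (assumed detail)
    """
    # Regularized pseudoinverse solution: W = (H^T H + ridge*I)^-1 H^T T,
    # i.e. the Moore-Penrose least-squares fit of T from H.
    HtH = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(HtH, H.T @ T)

# Toy usage: 100 samples, 64-dim features, 10 classes
rng = np.random.default_rng(0)
H = rng.standard_normal((100, 64))
T = np.eye(10)[rng.integers(0, 10, size=100)]  # one-hot targets
W = recompute_fc_weights(H, T)
print(W.shape)  # (64, 10)
```

In a setting like the one the abstract describes, such a closed-form recompute would be applied to each top FC layer after standard BP pre-training, using the current residual error as the target signal.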