Fok Hing Chi Tivive, Abdesselam Bouzerdoum
School of Electrical, Computer, and Telecommunications Engineering, University of Wollongong, Wollongong, NSW 2522, Australia.
IEEE Trans Neural Netw. 2005 May;16(3):541-56. doi: 10.1109/TNN.2005.845144.
This article presents several efficient training algorithms, based on first-order, second-order, and conjugate gradient optimization methods, for a class of convolutional neural networks (CoNNs) known as shunting inhibitory convolutional neural networks. Furthermore, a new hybrid method is proposed, derived from the principles of Quickprop, Rprop, SuperSAB, and least squares (LS). Experimental results show that the new hybrid method can perform as well as the Levenberg-Marquardt (LM) algorithm, but at a much lower computational cost and with smaller memory requirements. For the sake of comparison, the visual pattern recognition task of face/nonface discrimination is chosen as the classification problem for evaluating the training algorithms. Sixteen training algorithms are implemented for three variants of the proposed CoNN architecture: binary-, Toeplitz-, and fully connected architectures. All implemented algorithms can train the three network architectures successfully, but their convergence speeds vary markedly. In particular, the combinations of LS with the new hybrid method and of LS with the LM method achieve the best convergence rates in terms of the number of training epochs. In addition, the classification accuracies of all three architectures are assessed using tenfold cross-validation. The results show that the binary- and Toeplitz-connected architectures slightly outperform the fully connected architecture: the lowest error rates across all training algorithms are 1.95% for the Toeplitz-connected, 2.10% for the binary-connected, and 2.20% for the fully connected network. In general, the modified Broyden-Fletcher-Goldfarb-Shanno (BFGS) methods, the three variants of the LM algorithm, and the new hybrid/LS method perform consistently well, achieving error rates of less than 3% averaged across all three architectures.
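For readers unfamiliar with the shunting inhibitory neuron model underlying these networks, the sketch below illustrates one common formulation, in which an excitatory convolution is divisively gated by an inhibitory one. The kernel sizes, the choice of tanh for both activation functions, and the passive decay constant `a` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy.signal import convolve2d

def shunting_feature_map(X, W, C, b, d, a=2.0):
    """Compute one feature map of a shunting inhibitory convolutional layer.

    A common formulation (assumed here, not quoted from the paper):
        Z = g(W * X + b) / (a + f(C * X + d)),
    where * denotes 2-D convolution, g and f are activation functions,
    and a > 0 is a passive decay term keeping the denominator positive.
    """
    g = np.tanh          # excitatory activation (illustrative choice)
    f = np.tanh          # inhibitory activation (illustrative choice)
    num = g(convolve2d(X, W, mode="valid") + b)
    den = a + f(convolve2d(X, C, mode="valid") + d)   # >= a - 1 > 0 for a > 1
    return num / den     # element-wise divisive (shunting) inhibition

# Toy usage: a 28x28 input patch with 5x5 receptive fields.
rng = np.random.default_rng(0)
X = rng.standard_normal((28, 28))
W = rng.standard_normal((5, 5)) * 0.1
C = rng.standard_normal((5, 5)) * 0.1
Z = shunting_feature_map(X, W, C, b=0.0, d=0.0)
print(Z.shape)  # (24, 24)
```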
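The abstract does not spell out the hybrid update rule, but one of its named ingredients, Rprop, is standard: each weight keeps its own step size, which grows while the gradient sign is stable and shrinks when it flips. Below is a minimal sketch of that ingredient alone (the step-size bounds and growth/shrink factors are the usual Rprop defaults, assumed here; this is not the paper's full hybrid algorithm).

```python
import numpy as np

def rprop_step(w, grad, prev_grad, delta,
               eta_plus=1.2, eta_minus=0.5,
               delta_min=1e-6, delta_max=50.0):
    """One Rprop update for a parameter array `w`.

    Each parameter has its own step size `delta`, adapted from the sign
    agreement of successive gradients; the weight moves by
    -sign(grad) * delta, ignoring the gradient magnitude.
    """
    sign_change = grad * prev_grad
    delta = np.where(sign_change > 0,
                     np.minimum(delta * eta_plus, delta_max), delta)
    delta = np.where(sign_change < 0,
                     np.maximum(delta * eta_minus, delta_min), delta)
    # On a sign flip, skip the step this iteration (iRprop- variant).
    grad = np.where(sign_change < 0, 0.0, grad)
    w = w - np.sign(grad) * delta
    return w, grad, delta   # returned grad becomes prev_grad next call
```

The sign-flip handling above follows the iRprop- variant rather than the original weight-backtracking Rprop; either fits the abstract's level of description.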
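As for the LS component that the abstract pairs with the hybrid and LM methods, one plausible reading (an assumption, not stated in the abstract) is that the linear output layer is solved in closed form by least squares rather than by gradient descent, as sketched below; the ridge term is a standard numerical safeguard added here for illustration.

```python
import numpy as np

def solve_output_weights(H, T, ridge=1e-6):
    """Fit linear output-layer weights by least squares.

    Given hidden-layer activations H (n_samples x n_hidden, including a
    bias column) and targets T (n_samples x n_outputs), solve
    min_W ||H W - T||^2 via the ridge-regularized normal equations.
    """
    A = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ T)
```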