Department of Mechanical Engineering, McMaster University, Hamilton, Canada.
Neural Netw. 2018 Dec;108:509-526. doi: 10.1016/j.neunet.2018.09.012. Epub 2018 Oct 3.
Deep-Learning has become a leading strategy for artificial intelligence and is being applied in many fields owing to its excellent performance, which has surpassed human cognitive abilities in a number of classification and control problems (Ciregan, Meier, & Schmidhuber, 2012; Mnih et al., 2015). However, the training process of Deep-Learning is usually slow and requires high-performance computing capable of handling large datasets. Optimizing the training method can improve the learning rate of Deep-Learning networks and yield higher performance for the same number of training epochs (cycles). This paper considers the use of estimation theory for training large neural networks, in particular Deep-Learning networks. Two estimation strategies, namely the Extended Kalman Filter (EKF) and the Smooth Variable Structure Filter (SVSF), have been revised (subsequently referred to as REKF and RSVSF) and used for network training. They are applied to several benchmark datasets and comparatively evaluated.
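The core idea of estimation-based training is to treat the network weights as the state of a nonlinear system and the training targets as noisy measurements, so that a filter such as the EKF can update the weights sample by sample. The sketch below, a minimal illustration and not the paper's REKF/RSVSF algorithms, trains a small one-hidden-layer network with a standard EKF; the network shape, noise parameters, and finite-difference Jacobian are all illustrative assumptions.

```python
import numpy as np

N_HIDDEN = 3  # illustrative network size, not from the paper

def mlp_forward(w, x):
    # Unpack the flat weight vector into a 1-input, N_HIDDEN-tanh, 1-output MLP.
    W1 = w[:N_HIDDEN]
    b1 = w[N_HIDDEN:2 * N_HIDDEN]
    W2 = w[2 * N_HIDDEN:3 * N_HIDDEN]
    b2 = w[3 * N_HIDDEN]
    h = np.tanh(W1 * x + b1)
    return W2 @ h + b2

def jacobian(w, x, eps=1e-6):
    # Finite-difference Jacobian of the scalar output w.r.t. the weights
    # (an analytic Jacobian would normally be used for speed).
    J = np.zeros_like(w)
    for k in range(w.size):
        wp, wm = w.copy(), w.copy()
        wp[k] += eps
        wm[k] -= eps
        J[k] = (mlp_forward(wp, x) - mlp_forward(wm, x)) / (2 * eps)
    return J

def ekf_train(xs, ys, epochs=50, q=1e-6, r=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    n_w = 3 * N_HIDDEN + 1
    w = 0.5 * rng.standard_normal(n_w)  # state: all network weights
    P = np.eye(n_w)                     # state (weight) covariance
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            H = jacobian(w, x)          # measurement Jacobian
            S = H @ P @ H + r           # innovation variance (scalar output)
            K = (P @ H) / S             # Kalman gain
            w = w + K * (y - mlp_forward(w, x))  # weight update
            P = P - np.outer(K, H @ P) + q * np.eye(n_w)
    return w

# Usage: fit a smooth 1-D function; EKF training typically converges
# in far fewer epochs than plain gradient descent on such problems.
xs = np.linspace(-2.0, 2.0, 21)
ys = np.sin(xs)
w = ekf_train(xs, ys)
mse = np.mean([(mlp_forward(w, x) - y) ** 2 for x, y in zip(xs, ys)])
```

The per-sample covariance update is what distinguishes this from first-order methods: the gain `K` scales each weight's correction by its current uncertainty, which is the mechanism the paper's revised filters build on.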