

Two-Phase Switching Optimization Strategy in Deep Neural Networks.

Publication Information

IEEE Trans Neural Netw Learn Syst. 2022 Jan;33(1):330-339. doi: 10.1109/TNNLS.2020.3027750. Epub 2022 Jan 5.

Abstract

Optimization in a deep neural network is challenging due to the vanishing gradient problem and the intensive fine-tuning of network hyperparameters. Inspired by multistage decision control systems, the stochastic diagonal approximate greatest descent (SDAGD) algorithm is proposed in this article to seek optimal learning weights using a two-phase switching optimization strategy. The proposed optimizer controls the relative step length, derived from the long-term optimal trajectory, and adopts a diagonally approximated Hessian for efficient weight updates. In Phase-I, it computes the greatest step length at the boundary of each local spherical search region and subsequently descends rapidly toward an optimal solution. In Phase-II, it switches automatically to an approximate Newton method once it is close to the optimal solution to achieve fast convergence. The experiments show that SDAGD produces steeper learning curves and achieves lower misclassification rates compared with other optimization techniques. Application of the proposed optimizer to deeper networks is also investigated in this article to study the vanishing gradient problem.
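The abstract describes the two-phase switching behavior only in words. The sketch below is a minimal, hypothetical reconstruction of one SDAGD-style update, assuming the relative step length is tied to the gradient norm and the radius of the local spherical search region, and that the Hessian is replaced by its diagonal. The function and parameter names are illustrative and are not taken from the authors' implementation.

```python
import numpy as np

def sdagd_step(w, grad, diag_hess, radius=1.0, eps=1e-8):
    """One illustrative SDAGD-style update (hypothetical sketch, not the authors' code).

    Phase-I : far from a minimum, the relative step-length term mu dominates,
              so the step is effectively limited by the radius of the local
              spherical search region.
    Phase-II: near a minimum, the gradient norm (and hence mu) shrinks, and
              the update approaches a diagonal approximate-Newton step.
    """
    # Relative step-length control derived from the gradient norm and the search
    # radius (assumed form; the paper derives it from the long-term optimal trajectory).
    mu = np.linalg.norm(grad) / radius
    # A diagonal approximate Hessian turns the "matrix inverse" into an
    # element-wise division.
    return w - grad / (np.abs(diag_hess) + mu + eps)

# Toy usage on a separable quadratic f(w) = 0.5 * sum(a_i * w_i**2),
# whose gradient is a * w and whose Hessian diagonal is a.
a = np.array([10.0, 1.0, 0.1])
w = np.array([5.0, -3.0, 8.0])
for _ in range(50):
    w = sdagd_step(w, grad=a * w, diag_hess=a, radius=1.0)
print(w)  # should approach the minimizer at the origin
```

On this toy quadratic, the early iterations behave like Phase-I (steps bounded by the search radius because mu is large), while later iterations approach a diagonal Newton step as the gradient norm shrinks, mirroring the automatic switching described in the abstract.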

