
Magnitude and angle dynamics in training single ReLU neurons.

Author Information

Lee Sangmin, Sim Byeongsu, Ye Jong Chul

Affiliations

Department of Mathematical Sciences, KAIST, Daejeon, Republic of Korea.

Kim Jaechul Graduate School of AI, KAIST, Daejeon, Republic of Korea.

Publication Information

Neural Netw. 2024 Oct;178:106435. doi: 10.1016/j.neunet.2024.106435. Epub 2024 Jun 22.

Abstract

Understanding the training dynamics of deep ReLU networks is a significant area of interest in deep learning. However, there remains a lack of complete elucidation regarding the weight vector dynamics, even for single ReLU neurons. To bridge this gap, our study delves into the training dynamics of the gradient flow w(t) for single ReLU neurons under the square loss, dissecting it into its magnitude ‖w(t)‖ and angle φ(t) components. Through this decomposition, we establish upper and lower bounds on these components to elucidate the convergence dynamics. Furthermore, we demonstrate the empirical extension of our findings to general two-layer multi-neuron networks. All theoretical results are generalized to the gradient descent method and rigorously verified through experiments.
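The magnitude/angle decomposition described in the abstract can be illustrated numerically. Below is a minimal sketch (not the authors' code): a single ReLU student neuron is trained by plain gradient descent on the square loss against a noiseless unit-norm teacher, and the final weight is decomposed into its magnitude ‖w(t)‖ and its angle φ(t) to the teacher. The dimensions, sample size, learning rate, and initialization are illustrative assumptions, including the positively correlated initialization often assumed in single-neuron convergence analyses.

```python
import numpy as np

# Illustrative sketch (not the paper's code): decompose the trajectory of a
# single ReLU neuron trained by gradient descent on the square loss into its
# magnitude ‖w(t)‖ and its angle φ(t) to a unit-norm teacher.
rng = np.random.default_rng(0)
d, n, lr, steps = 5, 200, 0.1, 500   # assumed demo hyperparameters

X = rng.standard_normal((n, d))      # Gaussian inputs
w_star = np.zeros(d)
w_star[0] = 1.0                      # unit-norm teacher weight
y = np.maximum(X @ w_star, 0.0)      # noiseless ReLU teacher labels

# Initialize with a positive inner product with the teacher (an assumption
# commonly made so that the single-neuron dynamics converge).
w = np.full(d, 0.1)

def mag_angle(w):
    m = np.linalg.norm(w)                            # magnitude ‖w‖
    cos = np.clip(w @ w_star / m, -1.0, 1.0)
    return m, np.arccos(cos)                         # angle φ to teacher

for _ in range(steps):
    pre = X @ w
    err = np.maximum(pre, 0.0) - y                   # ReLU output residual
    # Gradient of (1/2n) * sum_i (relu(w·x_i) - y_i)^2 with respect to w;
    # the indicator (pre > 0) restricts to the student's active region.
    grad = (X * ((pre > 0) * err)[:, None]).mean(axis=0)
    w -= lr * grad

mag, phi = mag_angle(w)
print(f"||w|| = {mag:.3f}, phi = {phi:.3f} rad")
```

In this realizable, noiseless setting the student recovers the teacher, so ‖w(t)‖ approaches 1 and φ(t) approaches 0, matching the qualitative convergence behavior that the paper's upper and lower bounds characterize.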

