Stochastic Training of Neural Networks via Successive Convex Approximations.

Author Information

Scardapane Simone, Di Lorenzo Paolo

Publication Information

IEEE Trans Neural Netw Learn Syst. 2018 Oct;29(10):4947-4956. doi: 10.1109/TNNLS.2017.2785361. Epub 2018 Jan 15.

Abstract

This paper proposes a new family of algorithms for training neural networks (NNs). These are based on recent developments in the field of nonconvex optimization, going under the general name of successive convex approximation (SCA) techniques. The basic idea is to iteratively replace the original (nonconvex, high-dimensional) learning problem with a sequence of (strongly convex) approximations, which are both accurate and simple to optimize. Unlike similar approaches (e.g., quasi-Newton algorithms), the approximations can be constructed using only first-order information of the NN function, in a stochastic fashion, while exploiting the overall structure of the learning problem for faster convergence. We discuss several use cases, based on different choices for the loss function (e.g., squared loss and cross-entropy loss) and for the regularization of the NN's weights. We experiment on several medium-sized benchmark problems and on a large-scale data set involving simulated physical data. The results show how the algorithm outperforms state-of-the-art techniques, providing faster convergence to a better minimum. Additionally, we show how the algorithm can be easily parallelized over multiple computational units without hindering its performance. In particular, each computational unit can optimize a tailored surrogate function defined on a randomly assigned subset of the input variables, whose dimension can be chosen based on the available computational power.
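
To illustrate the general idea (this is a minimal sketch, not the paper's actual surrogates, which exploit the structure of the NN and of the chosen loss and regularizer), a generic stochastic SCA iteration builds a strongly convex surrogate around the current iterate from a mini-batch gradient, minimizes it in closed form, and moves toward that minimizer with a diminishing step size. The sketch below assumes an L1 regularizer and a simple proximal-linear surrogate; all function names, parameters, and constants (tau, lam, gamma0, eps) are illustrative choices, not values from the paper.

import numpy as np

def soft_threshold(z, kappa):
    # Proximal operator of kappa * ||.||_1: closed-form minimizer of the surrogate.
    return np.sign(z) * np.maximum(np.abs(z) - kappa, 0.0)

def sca_step(w, grad_batch, t, tau=1.0, lam=1e-4, gamma0=0.5, eps=1e-3):
    # One stochastic SCA-style iteration. The surrogate around the current w,
    #   U(w'; w) = grad_batch^T (w' - w) + (tau/2)||w' - w||^2 + lam * ||w'||_1,
    # is strongly convex and minimized in closed form by soft-thresholding.
    w_hat = soft_threshold(w - grad_batch / tau, lam / tau)
    # Diminishing convex-combination step size, as is common in stochastic SCA schemes.
    gamma = gamma0 / (1.0 + eps * t)
    return w + gamma * (w_hat - w)

# Toy usage on a least-squares problem, standing in for the NN mini-batch gradient.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(200, 50)), rng.normal(size=200)
w = np.zeros(50)
for t in range(500):
    idx = rng.choice(200, size=32, replace=False)        # mini-batch of samples
    grad = A[idx].T @ (A[idx] @ w - b[idx]) / len(idx)    # stochastic gradient estimate
    w = sca_step(w, grad, t, tau=5.0)

The parallel variant described in the abstract would assign each computational unit a random subset of the variables and have it minimize its own surrogate over that block only; the sketch above shows a single-block update.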
