Department of Information and Computer Science, Aalto University School of Science, Espoo, Uusimaa 02150, Finland.
Neural Comput. 2013 Mar;25(3):805-31. doi: 10.1162/NECO_a_00397. Epub 2012 Nov 13.
Restricted Boltzmann machines (RBMs) are often used as building blocks in greedy learning of deep networks. However, training this simple model can be laborious. Traditional learning algorithms often converge only with the right choice of metaparameters that specify, for example, learning rate scheduling and the scale of the initial weights. They are also sensitive to specific data representation. An equivalent RBM can be obtained by flipping some bits and changing the weights and biases accordingly, but traditional learning rules are not invariant to such transformations. Without careful tuning of these training settings, traditional algorithms can easily get stuck or even diverge. In this letter, we present an enhanced gradient that is derived to be invariant to bit-flipping transformations. We experimentally show that the enhanced gradient yields more stable training of RBMs both when used with a fixed learning rate and an adaptive one.
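To make the bit-flipping equivalence concrete, below is a minimal NumPy sketch (not from the letter; the tiny RBM size, the variable names, and the check by exact enumeration are illustrative assumptions). It flips one visible unit, negates the corresponding weight row and visible bias, folds that row into the hidden biases, and verifies that the two parameterizations define the same distribution up to relabelling of the flipped bit.

```python
# Minimal sketch (illustrative, not from the letter): an RBM and its
# bit-flipped reparameterization define the same distribution P(v).
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_v, n_h, k = 3, 2, 1             # RBM small enough to enumerate; flip visible unit k

W = rng.normal(size=(n_v, n_h))   # weights
b = rng.normal(size=n_v)          # visible biases
c = rng.normal(size=n_h)          # hidden biases

def exact_pv(W, b, c):
    """Exact marginal P(v) of a binary RBM, summing exp(-E(v, h)) over h."""
    vs = np.array(list(itertools.product([0, 1], repeat=n_v)))
    hs = np.array(list(itertools.product([0, 1], repeat=n_h)))
    log_unnorm = vs @ W @ hs.T + (vs @ b)[:, None] + (hs @ c)[None, :]
    p = np.exp(log_unnorm).sum(axis=1)
    return p / p.sum(), vs

# Flip visible unit k (v_k -> 1 - v_k): the equivalent parameters are
# obtained by folding row k of W into the hidden biases and negating
# that row and the corresponding visible bias.
W2, b2, c2 = W.copy(), b.copy(), c.copy()
c2 += W[k]
W2[k] = -W[k]
b2[k] = -b[k]

p_orig, vs = exact_pv(W, b, c)
p_flip, _ = exact_pv(W2, b2, c2)

# Compare: P_orig(v) should equal P_flip(v with bit k flipped).
flipped = vs.copy()
flipped[:, k] = 1 - flipped[:, k]
index = {tuple(v): i for i, v in enumerate(vs)}
p_flip_relabelled = p_flip[[index[tuple(v)] for v in flipped]]

print(np.allclose(p_orig, p_flip_relabelled))   # -> True
```

Note that under such a relabelling the raw statistics <v_i h_j> used by the traditional update rules change, which is why those rules are not invariant to the transformation, whereas the enhanced gradient is derived to be.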