Ota Toshihiro, Karakida Ryo
CyberAgent, Shibuya, Tokyo, 150-6121, Japan.
RIKEN, Wako, Saitama, 351-0198, Japan.
Neural Comput. 2023 Jun 20:1-18. doi: 10.1162/neco_a_01597.
Hopfield networks and Boltzmann machines (BMs) are fundamental energy-based neural network models. Recent studies on modern Hopfield networks have broadened the class of energy functions and led to a unified perspective on general Hopfield networks, including an attention module. In this letter, we consider the BM counterparts of modern Hopfield networks using the associated energy functions and study their salient properties from a trainability perspective. In particular, the energy function corresponding to the attention module naturally introduces a novel BM, which we refer to as the attentional BM (AttnBM). We verify that AttnBM has a tractable likelihood function and gradient for certain special cases and is easy to train. Moreover, we reveal hidden connections between AttnBM and some single-layer models, namely the Gaussian-Bernoulli restricted BM and the denoising autoencoder with softmax units derived from denoising score matching. We also investigate BMs introduced by other energy functions and show that the energy function of dense associative memory models yields BMs belonging to the class of exponential family harmoniums.
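For concreteness, the following is a minimal sketch (not taken from the letter itself) of the log-sum-exp energy behind the attention module of modern Hopfield networks, written in illustrative notation: stored patterns x_1, ..., x_N, state vector \xi, inverse temperature \beta, with additive constants dropped. A BM counterpart in the sense described above would assign this energy a Gibbs distribution:

    E(\xi) = -\frac{1}{\beta} \log \sum_{i=1}^{N} \exp\left(\beta \, x_i^{\top} \xi\right) + \frac{1}{2} \xi^{\top} \xi, \qquad p(\xi) \propto e^{-E(\xi)}.

One update step of the state under this energy recovers the softmax attention rule, which is why the resulting BM is naturally termed attentional.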