
Attention in a Family of Boltzmann Machines Emerging From Modern Hopfield Networks.

Authors

Ota Toshihiro, Karakida Ryo

Affiliations

CyberAgent, Shibuya, Tokyo, 150-6121, Japan.

RIKEN, Wako, Saitama, 351-0198, Japan.

Publication

Neural Comput. 2023 Jun 20:1-18. doi: 10.1162/neco_a_01597.

Abstract

Hopfield networks and Boltzmann machines (BMs) are fundamental energy-based neural network models. Recent studies on modern Hopfield networks have broadened the class of energy functions and led to a unified perspective on general Hopfield networks, including an attention module. In this letter, we consider the BM counterparts of modern Hopfield networks using the associated energy functions and study their salient properties from a trainability perspective. In particular, the energy function corresponding to the attention module naturally introduces a novel BM, which we refer to as the attentional BM (AttnBM). We verify that AttnBM has a tractable likelihood function and gradient for certain special cases and is easy to train. Moreover, we reveal the hidden connections between AttnBM and some single-layer models, namely the gaussian-Bernoulli restricted BM and the denoising autoencoder with softmax units coming from denoising score matching. We also investigate BMs introduced by other energy functions and show that the energy function of dense associative memory models gives BMs belonging to exponential family harmoniums.
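The abstract's claim that an attention module arises from a modern Hopfield energy function refers to a known result: one retrieval step of a modern Hopfield network with stored patterns X takes the form of softmax attention over those patterns. The sketch below is illustrative only, not code from the paper; the function names and the toy patterns are our own, and it shows the widely cited update rule, not the AttnBM model itself.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def hopfield_attention_update(X, xi, beta=1.0):
    """One retrieval step of a modern Hopfield network.

    X   : (d, N) matrix whose columns are stored patterns
    xi  : (d,) current state (query)
    beta: inverse temperature; larger beta sharpens retrieval

    The update xi_new = X @ softmax(beta * X.T @ xi) is exactly a
    single softmax-attention step with X as both keys and values.
    """
    return X @ softmax(beta * X.T @ xi)

# Retrieval demo: a noisy query is pulled toward the nearest stored pattern.
X = np.array([[1.0, -1.0],
              [1.0,  1.0],
              [-1.0, 1.0]])          # two 3-dim patterns (columns)
xi = np.array([0.9, 1.1, -0.8])      # noisy version of the first pattern
out = hopfield_attention_update(X, xi, beta=4.0)
# → approximately [1, 1, -1], the first stored pattern
```

At large beta the softmax concentrates on the best-matching pattern, recovering classical associative-memory retrieval; at small beta the update returns a soft mixture of patterns, which is the attention-like regime.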

