Millidge Beren, Salvatori Tommaso, Song Yuhang, Lukasiewicz Thomas, Bogacz Rafal
MRC Brain Network Dynamics Unit, University of Oxford, UK.
Department of Computer Science, University of Oxford, UK.
Proc Mach Learn Res. 2022 Jul;162:15561-15583.
A large number of neural network models of associative memory have been proposed in the literature. These include the classical Hopfield networks (HNs), sparse distributed memories (SDMs), and more recently the modern continuous Hopfield networks (MCHNs), which possess close links with self-attention in machine learning. In this paper, we propose a general framework for understanding the operation of such memory networks as a sequence of three operations: similarity, separation, and projection. We derive all these memory models as instances of our general framework with differing similarity and separation functions. We extend the mathematical framework of Krotov & Hopfield (2020) to express general associative memory models using neural network dynamics with local computation, and derive a general energy function that is a Lyapunov function of the dynamics. Finally, using our framework, we empirically investigate the capacity of these associative memory models under different similarity functions, beyond the standard dot-product similarity measure, and demonstrate that Euclidean or Manhattan distance similarity metrics perform substantially better in practice on many tasks, enabling more robust retrieval and higher memory capacity than existing models.
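The three-step decomposition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code; it is a hedged illustration assuming a memory matrix `M` whose rows are stored patterns, a similarity function scoring a query against each row, a softmax separation function, and a projection back through the stored patterns. With dot-product similarity this reduces to the familiar MCHN (attention-like) update; swapping in negative Euclidean distance gives the alternative similarity metric the abstract reports as performing better.

```python
import numpy as np

def retrieve(M, q, sim, sep):
    """Single-shot associative retrieval as similarity -> separation -> projection.
    M: (num_patterns, dim) matrix of stored patterns (rows); q: query vector."""
    scores = sim(M, q)     # similarity: score the query against each stored pattern
    weights = sep(scores)  # separation: sharpen the score distribution
    return weights @ M     # projection: weighted combination of stored patterns

def softmax(s, beta=8.0):
    # Numerically stable softmax separation function with inverse temperature beta.
    e = np.exp(beta * (s - s.max()))
    return e / e.sum()

# Dot-product similarity (recovers the MCHN / self-attention-style update).
dot_sim = lambda M, q: M @ q

# Negative Euclidean distance as an alternative similarity function.
euclid_sim = lambda M, q: -np.linalg.norm(M - q, axis=1)

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 64))            # five stored random patterns
q = M[2] + 0.1 * rng.standard_normal(64)    # noisy (corrupted) version of pattern 2

out_dot = retrieve(M, q, dot_sim, softmax)
out_euc = retrieve(M, q, euclid_sim, softmax)
```

Both variants recover the clean stored pattern `M[2]` from the corrupted query; the paper's empirical point is that distance-based similarities remain reliable in regimes (e.g. correlated or numerous patterns) where dot-product similarity degrades.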