Determinantal point process attention over grid cell code supports out-of-distribution generalization.

Author Information

Mondal Shanka Subhra, Frankland Steven, Webb Taylor W, Cohen Jonathan D

Affiliations

Department of Electrical and Computer Engineering, Princeton University, Princeton, United States.

Princeton Neuroscience Institute, Princeton University, Princeton, United States.

Publication Information

eLife. 2024 Aug 1;12:RP89911. doi: 10.7554/eLife.89911.

Abstract

Deep neural networks have made tremendous gains in emulating human-like intelligence, and have been used increasingly as ways of understanding how the brain may solve the complex computational problems on which this relies. However, these networks still fall short of, and therefore fail to provide insight into, how the brain supports the strong forms of generalization of which humans are capable. One such case is out-of-distribution (OOD) generalization: successful performance on test examples that lie outside the distribution of the training set. Here, we identify properties of processing in the brain that may contribute to this ability. We describe a two-part algorithm that draws on specific features of neural computation to achieve OOD generalization, and provide a proof of concept by evaluating performance on two challenging cognitive tasks. First, we draw on the fact that the mammalian brain represents metric spaces using the grid cell code (e.g., in the entorhinal cortex): abstract representations of relational structure, organized in recurring motifs that cover the representational space. Second, we propose an attentional mechanism that operates over the grid cell code using a determinantal point process (DPP), which we call DPP attention (DPP-A): a transformation that ensures maximum sparseness in the coverage of that space. We show that a loss function that combines standard task-optimized error with DPP-A can exploit the recurring motifs in the grid cell code, and can be integrated with common architectures to achieve strong OOD generalization performance on analogy and arithmetic tasks. This provides both an interpretation of how the grid cell code in the mammalian brain may contribute to generalization performance and, at the same time, a potential means for improving such capabilities in artificial neural networks.
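To make the DPP-A objective concrete, the following is a minimal, illustrative sketch of how a determinantal point process term can be combined with a standard task loss. It is not the authors' implementation: the function names, the plain dot-product kernel, and the trade-off coefficient beta are assumptions for illustration. The log-determinant of an attention-weighted similarity kernel grows when attention mass falls on mutually non-redundant embeddings, so maximizing it favors sparse, diverse coverage of the representational space.

```python
import torch

def dpp_log_det(embeddings: torch.Tensor, weights: torch.Tensor,
                eps: float = 1e-6) -> torch.Tensor:
    """Differentiable DPP diversity score (illustrative sketch).

    embeddings: (N, D) grid-cell-like feature vectors covering the space.
    weights:    (N,) nonnegative attention weights over those vectors.

    Returns the log-determinant of the attention-weighted similarity
    kernel. The determinant is largest when strongly weighted vectors
    are mutually non-redundant, so maximizing this term encourages
    sparse, diverse coverage of the representational space.
    """
    q = weights.clamp_min(0.0).sqrt()     # per-item quality terms
    K = embeddings @ embeddings.T         # (N, N) similarity kernel
    L = q[:, None] * K * q[None, :]       # weighted L-ensemble kernel
    # Small diagonal jitter keeps the log-determinant numerically stable.
    L = L + eps * torch.eye(L.shape[0], device=L.device, dtype=L.dtype)
    return torch.logdet(L)

# Hypothetical combined objective: standard task error minus a DPP-A
# diversity bonus, with beta (an assumed hyperparameter) trading them off.
def combined_loss(task_loss: torch.Tensor, embeddings: torch.Tensor,
                  weights: torch.Tensor, beta: float = 0.1) -> torch.Tensor:
    return task_loss - beta * dpp_log_det(embeddings, weights)
```

In the paper the kernel is defined over the multi-scale, periodic structure of the grid cell code; the dot-product kernel above is only a stand-in for that similarity measure.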

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d68/11293867/022898d79013/elife-89911-fig1.jpg
