
Transformers and cortical waves: encoders for pulling in context across time.

Affiliations

Department of Mathematics, Western University, London, Ontario, Canada; Fields Laboratory for Network Science, Fields Institute, Toronto, Ontario, Canada.

Department of Philosophy, University of California at San Diego, San Diego, CA, USA.

Publication information

Trends Neurosci. 2024 Oct;47(10):788-802. doi: 10.1016/j.tins.2024.08.006. Epub 2024 Sep 27.

Abstract

The capabilities of transformer networks such as ChatGPT and other large language models (LLMs) have captured the world's attention. The crucial computational mechanism underlying their performance relies on transforming a complete input sequence - for example, all the words in a sentence - into a long 'encoding vector' that allows transformers to learn long-range temporal dependencies in naturalistic sequences. Specifically, 'self-attention' applied to this encoding vector enhances temporal context in transformers by computing associations between pairs of words in the input sequence. We suggest that waves of neural activity traveling across single cortical areas, or multiple regions on the whole-brain scale, could implement a similar encoding principle. By encapsulating recent input history into a single spatial pattern at each moment in time, cortical waves may enable a temporal context to be extracted from sequences of sensory inputs, the same computational principle as that used in transformers.
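As a concrete illustration of the mechanism the abstract refers to, the sketch below (not taken from the paper; the array sizes, random weight matrices, and variable names are illustrative assumptions) shows how scaled dot-product self-attention computes pairwise associations between positions in an input sequence and uses them to mix temporal context into each position's representation.

```python
# Minimal sketch (not from the paper) of the self-attention computation the
# abstract describes: pairwise associations between sequence positions are
# used to blend information from the whole input sequence into each position.
# All dimensions and weight matrices here are toy, illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 6, 16                       # e.g. 6 "words", 16-dim embeddings
X = rng.standard_normal((seq_len, d_model))    # encoded input sequence

# Learned projections (random stand-ins here) for queries, keys, and values.
W_q = rng.standard_normal((d_model, d_model))
W_k = rng.standard_normal((d_model, d_model))
W_v = rng.standard_normal((d_model, d_model))

Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Pairwise association scores between every pair of positions (words).
scores = Q @ K.T / np.sqrt(d_model)

# Softmax over each row turns scores into attention weights.
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

# Each output position is a context-weighted mixture of all value vectors,
# so every position carries information from the entire sequence.
context = weights @ V
print(context.shape)   # (6, 16)
```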


