The detection of algebraic auditory structures emerges with self-supervised learning.

Authors

Orhan Pierre, Boubenec Yves, King Jean-Rémi

Affiliations

Laboratoire des Systèmes Perceptifs, Département d'études Cognitives, École Normale Supérieure, PSL University, CNRS, Paris, France.

Meta, Paris, France.

Publication information

PLoS Comput Biol. 2025 Sep 5;21(9):e1013271. doi: 10.1371/journal.pcbi.1013271. eCollection 2025 Sep.

Abstract

Humans can spontaneously detect complex algebraic structures. Historically, two opposing views have sought to explain this ability, which lies at the root of language and music acquisition. Some argue for the existence of an innate and specific mechanism. Others argue that this ability emerges from experience, i.e., when generic learning principles continuously process sensory inputs. These two views, however, remain difficult to test experimentally. Here, we use deep learning models to evaluate the factors that lead to the spontaneous detection of algebraic structures in the auditory modality. Specifically, we use self-supervised learning to train multiple deep learning models with variable amounts of natural sounds (environmental sounds) and/or cultural sounds (speech or music) to evaluate the impact of these stimuli. We then expose these models to the experimental paradigms classically used to evaluate the processing of algebraic structures. Like humans, these models spontaneously detect repeated sequences, probabilistic chunks, and complex algebraic structures. Also like humans, this ability diminishes with structure complexity. Importantly, this ability can emerge from experience alone: the more the models are exposed to natural sounds, the more they spontaneously detect increasingly complex structures. Finally, this ability does not emerge in models pretrained only on speech, and it emerges more rapidly in models pretrained with music than in models pretrained with environmental sounds. Overall, our study provides an operational framework to clarify which built-in and acquired principles suffice to model humans' advanced capacity to detect algebraic structures in sounds.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7db/12431648/99d3ba695fc9/pcbi.1013271.g001.jpg
