Laboratoire de Sciences Cognitives et Psycholinguistique, Département d'Etudes Cognitives, ENS, EHESS, CNRS, PSL University, Paris, France; Cognitive Machine Learning Team, INRIA, Paris, France; Meta AI Research, Paris, France.
Laboratoire de Sciences Cognitives et Psycholinguistique, Département d'Etudes Cognitives, ENS, EHESS, CNRS, PSL University, Paris, France; Cognitive Machine Learning Team, INRIA, Paris, France; Laboratoire de linguistique formelle, Université de Paris, CNRS, Paris, France.
Cognition. 2024 Apr;245:105734. doi: 10.1016/j.cognition.2024.105734. Epub 2024 Feb 8.
Infants learn their native language(s) at an amazing speed. Before they even talk, their perception adapts to the language(s) they hear. However, the mechanisms responsible for this perceptual attunement and the circumstances in which it takes place remain unclear. This paper presents the first attempt to study perceptual attunement using ecological child-centered audio data. We show that a simple prediction algorithm exhibits perceptual attunement when applied to unrealistically clean audiobook data, but fails to do so when applied to ecologically valid child-centered data. In the latter scenario, perceptual attunement emerges only when the prediction mechanism is supplemented with inductive biases that force the algorithm to focus exclusively on speech segments while learning speaker-, pitch-, and room-invariant representations. We argue that these biases are plausible given previous research on infants and non-human animals. More generally, we show that what our model learns and how it develops through exposure to speech depend exquisitely on the details of the input signal. In doing so, we illustrate the importance of considering ecologically valid input data when modeling language acquisition.