Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139.
Program in Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard University, Boston, MA 02115.
Proc Natl Acad Sci U S A. 2018 Apr 3;115(14):E3313-E3322. doi: 10.1073/pnas.1801614115. Epub 2018 Mar 21.
The cocktail party problem requires listeners to infer individual sound sources from mixtures of sound. The problem can be solved only by leveraging regularities in natural sound sources, but little is known about how such regularities are internalized. We explored whether listeners learn source "schemas" (the abstract structure shared by different occurrences of the same type of sound source) and use them to infer sources from mixtures. We measured the ability of listeners to segregate mixtures of time-varying sources. In each experiment, a subset of trials contained schema-based sources generated from a common template by transformations (transposition and time dilation) that introduced acoustic variation but preserved abstract structure. Across several tasks and classes of sound sources, schema-based sources consistently aided source separation, in some cases producing rapid improvements in performance over the first few exposures to a schema. Learning persisted across blocks that did not contain the learned schema, and listeners were able to learn and use multiple schemas simultaneously. No learning was evident when schemas were presented in the task-irrelevant (i.e., distractor) source. However, learning from task-relevant stimuli showed signs of being implicit, in that listeners were no more likely to report that sources recurred in experiments containing schema-based sources than in control experiments containing no schema-based sources. The results implicate a mechanism for rapidly internalizing abstract sound structure, facilitating accurate perceptual organization of sound sources that recur in the environment.
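The idea of generating schema-based sources from a common template can be illustrated with a minimal sketch. It is not the authors' stimulus code: the contour format (time in seconds, pitch in semitones), the function names `make_variant` and `relative_shape`, and the parameter ranges are all assumptions chosen for illustration. It shows only that transposition and time dilation change the acoustic values while leaving the relative structure intact.

```python
import random

def make_variant(template, semitone_shift, time_scale):
    """Transpose and time-dilate a template contour.

    template: list of (time_s, pitch_semitones) points (hypothetical format).
    semitone_shift: constant added to every pitch (transposition).
    time_scale: factor applied to every time point (time dilation).
    """
    return [(t * time_scale, p + semitone_shift) for t, p in template]

def relative_shape(contour):
    """Return pitch steps and duration fractions of a contour.

    Both quantities are unchanged by transposition and time dilation,
    so they stand in here for the 'abstract structure' of a schema.
    """
    times = [t for t, _ in contour]
    pitches = [p for _, p in contour]
    total = times[-1] - times[0]
    steps = [round(b - a, 6) for a, b in zip(pitches, pitches[1:])]
    fractions = [round((b - a) / total, 6) for a, b in zip(times, times[1:])]
    return steps, fractions

# Hypothetical template: four points of a time-varying pitch contour.
template = [(0.0, 0.0), (0.25, 3.0), (0.5, -2.0), (1.0, 1.0)]

# Assumed ranges: shifts of up to 6 semitones, durations scaled by 0.75x to 1.5x.
rng = random.Random(0)
variants = [
    make_variant(template, rng.uniform(-6, 6), rng.uniform(0.75, 1.5))
    for _ in range(3)
]

for v in variants:
    # Absolute values differ between variants; the relative shape does not.
    print(relative_shape(v) == relative_shape(template))
```

Each variant differs from the template in absolute pitch and duration, yet its sequence of pitch steps and relative timings is identical, which is the sense in which the transformations preserve abstract structure.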