Beach Sara D, Ozernov-Palchik Ola, May Sidney C, Centanni Tracy M, Gabrieli John D E, Pantazis Dimitrios
McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA.
Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA.
Neurobiol Lang (Camb). 2021 May 7;2(2):254-279. doi: 10.1162/nol_a_00034. eCollection 2021 May.
Robust and efficient speech perception relies on the interpretation of acoustically variable phoneme realizations, yet prior neuroimaging studies are inconclusive regarding the degree to which subphonemic detail is maintained over time as categorical representations arise. It is also unknown whether this depends on the demands of the listening task. We addressed these questions by using neural decoding to quantify the (dis)similarity of brain response patterns evoked during two different tasks. We recorded magnetoencephalography (MEG) as adult participants heard isolated, randomized tokens from a /ba/-/da/ speech continuum. In the passive task, their attention was diverted. In the active task, they categorized each token as /ba/ or /da/. We found that linear classifiers successfully decoded /ba/ vs. /da/ perception from the MEG data. Data from the left hemisphere were sufficient to decode the percept early in the trial, while the right hemisphere was necessary but not sufficient for decoding at later time points. We also decoded stimulus representations and found that they were maintained longer in the active task than in the passive task; however, these representations did not pattern more like discrete phonemes when an active categorical response was required. Instead, in both tasks, early phonemic patterns gave way to a representation of stimulus ambiguity that coincided in time with reliable percept decoding. Our results suggest that the categorization process does not require the loss of subphonemic detail, and that the neural representation of isolated speech sounds includes concurrent phonemic and subphonemic information.
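The time-resolved decoding approach the abstract describes (a linear classifier trained to separate /ba/ from /da/ responses at each time point) can be sketched as follows. This is a minimal illustration on synthetic data, not the study's pipeline: the array shapes, the ridge classifier, the 0.5 signal amplitude, and the 20-35 sample "signal window" are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an MEG epoch array: trials x sensors x time samples.
# (Real dimensions and preprocessing are not specified in the abstract.)
n_trials, n_sensors, n_times = 80, 20, 50
y = np.repeat([-1.0, 1.0], n_trials // 2)    # -1 = /ba/ percept, +1 = /da/
rng.shuffle(y)

X = rng.normal(size=(n_trials, n_sensors, n_times))
pattern = rng.normal(size=n_sensors)          # fixed spatial pattern carrying the percept
signal_window = range(20, 35)                 # percept information injected only here
for t in signal_window:
    X[:, :, t] += 0.5 * y[:, None] * pattern

def cv_accuracy(Xt, y, n_folds=5, ridge=1.0):
    """5-fold cross-validated accuracy of a ridge-regularized linear classifier."""
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        Xtr, ytr = Xt[train_idx], y[train_idx]
        w = np.linalg.solve(Xtr.T @ Xtr + ridge * np.eye(Xt.shape[1]),
                            Xtr.T @ ytr)
        correct += np.sum(np.sign(Xt[test_idx] @ w) == y[test_idx])
    return correct / len(y)

# Decode the percept separately at every time point (time-resolved decoding):
# accuracy rises above chance only where percept information is present.
acc = np.array([cv_accuracy(X[:, :, t], y) for t in range(n_times)])
```

Running this yields near-chance accuracy outside the signal window and high accuracy inside it, which is the logic behind statements like "the percept could be decoded early in the trial": the classifier's cross-validated accuracy, evaluated at each latency, localizes when the distinguishing information is present in the response patterns.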