Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California, United States of America.
PLoS Biol. 2012 Jan;10(1):e1001251. doi: 10.1371/journal.pbio.1001251. Epub 2012 Jan 31.
How the human auditory system extracts perceptually relevant acoustic features of speech is unknown. To address this question, we used intracranial recordings from nonprimary auditory cortex in the human superior temporal gyrus to determine what acoustic information in speech sounds can be reconstructed from population neural activity. We found that slow and intermediate temporal fluctuations, such as those corresponding to syllable rate, were accurately reconstructed using a linear model based on the auditory spectrogram. However, reconstruction of fast temporal fluctuations, such as syllable onsets and offsets, required a nonlinear sound representation based on temporal modulation energy. Reconstruction accuracy was highest within the range of spectro-temporal fluctuations that have been found to be critical for speech intelligibility. The decoded speech representations allowed readout and identification of individual words directly from brain activity during single trial sound presentations. These findings reveal neural encoding mechanisms of speech acoustic parameters in higher order human auditory cortex.
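The linear decoding approach described above, reconstructing a stimulus spectrogram from population neural activity, can be illustrated with a minimal sketch. This is not the authors' implementation; it uses synthetic data and ordinary ridge regression with time-lagged electrode signals, and all sizes, noise levels, and the regularization constant are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Synthetic stand-ins for a recording session (all sizes hypothetical) ---
T, n_freq, n_elec, n_lags = 500, 16, 32, 5

# "Auditory spectrogram": smooth random time-frequency energy for a fake stimulus.
spec = rng.standard_normal((T, n_freq))
for _ in range(3):                      # crude temporal smoothing
    spec[1:] = 0.5 * spec[1:] + 0.5 * spec[:-1]

# Fake electrode responses: each electrode linearly mixes the spectrogram
# plus noise, so a linear decoder should recover much of the stimulus.
W_true = rng.standard_normal((n_freq, n_elec)) / np.sqrt(n_freq)
neural = spec @ W_true + 0.3 * rng.standard_normal((T, n_elec))

def lagged(X, n_lags):
    """Stack time-lagged copies of X -> (T, n_lags * n_channels)."""
    return np.concatenate([np.roll(X, lag, axis=0) for lag in range(n_lags)],
                          axis=1)

X = lagged(neural, n_lags)              # predictors: lagged neural activity
half = T // 2
Xtr, Xte, ytr, yte = X[:half], X[half:], spec[:half], spec[half:]

# Ridge regression: w = (X'X + aI)^-1 X'y, one weight map per frequency band.
a = 10.0
w = np.linalg.solve(Xtr.T @ Xtr + a * np.eye(X.shape[1]), Xtr.T @ ytr)
recon = Xte @ w

# Mean correlation between reconstructed and actual spectrogram bands on
# held-out data, a common measure of reconstruction accuracy.
r = np.mean([np.corrcoef(recon[:, f], yte[:, f])[0, 1] for f in range(n_freq)])
print(f"mean reconstruction r = {r:.2f}")
```

Because the synthetic neural responses are a linear function of the spectrogram, the linear decoder recovers it well; the abstract's point is that fast temporal fluctuations in real cortical data additionally require a nonlinear modulation-energy representation, which this sketch does not model.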