de Heer Wendy A, Huth Alexander G, Griffiths Thomas L, Gallant Jack L, Theunissen Frédéric E
Department of Psychology and Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, California 94720.
J Neurosci. 2017 Jul 5;37(27):6539-6557. doi: 10.1523/JNEUROSCI.3267-16.2017. Epub 2017 Jun 6.
Speech comprehension requires that the brain extract semantic meaning from the spectral features represented at the cochlea. To investigate this process, we performed an fMRI experiment in which five men and two women passively listened to several hours of natural narrative speech. We then used voxelwise modeling to predict BOLD responses based on three different feature spaces that represent the spectral, articulatory, and semantic properties of speech. The amount of variance explained by each feature space was then assessed using a separate validation dataset. Because some responses might be explained equally well by more than one feature space, we used a variance partitioning analysis to determine the fraction of the variance that was uniquely explained by each feature space. Consistent with previous studies, we found that speech comprehension involves hierarchical representations starting in primary auditory areas and moving laterally on the temporal lobe: spectral features are represented in the core of A1, mixtures of spectral and articulatory features in the STG, mixtures of articulatory and semantic features in the STS, and semantic features in the STS and beyond. Our data also show that both hemispheres are equally and actively involved in speech perception and interpretation. Further, responses as early in the auditory hierarchy as the STS are more strongly correlated with semantic than with spectral representations. These results illustrate the importance of using natural speech in neurolinguistic research. Our methodology also provides an efficient way to test multiple specific hypotheses about speech representations simultaneously, without resorting to block designs or segmented or synthetic speech.

To investigate the processing steps the human brain performs to transform natural speech sounds into meaningful language, we used models based on a hierarchical set of speech features to predict the BOLD responses of individual voxels recorded in an fMRI experiment while subjects listened to natural speech.
Both cerebral hemispheres were strongly and equally involved in speech processing. Moreover, the transformation from spectral features to semantic elements occurs early in the cortical speech-processing stream. Our experimental and analytical approaches are important alternatives and complements to standard approaches that use segmented speech and block designs, which report greater lateralization of speech processing and assign the associated semantic processing to higher levels of cortex than reported here.
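The variance partitioning described above can be illustrated with a minimal sketch: fit separate regularized regression models on each feature space and on their union, evaluate prediction accuracy on held-out data, and take set-theoretic differences of the explained variances. This is not the authors' pipeline (which used banded ridge regression on real stimulus features and BOLD data); the feature matrices and voxel response below are simulated, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression weights: (X'X + aI)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

def r2(X, y, w):
    """Fraction of response variance explained on held-out data."""
    resid = y - X @ w
    return 1.0 - resid.var() / y.var()

n_train, n_test = 400, 200
n = n_train + n_test

# Two partially overlapping "feature spaces" (stand-ins for, e.g.,
# spectral and semantic descriptions of the same stimulus): each
# contains three shared columns plus three unique columns.
shared = rng.standard_normal((n, 3))
A = np.hstack([shared, rng.standard_normal((n, 3))])
B = np.hstack([shared, rng.standard_normal((n, 3))])

# Simulated voxel response driven by both spaces, plus noise.
y = A @ rng.standard_normal(6) + B @ rng.standard_normal(6) \
    + 0.5 * rng.standard_normal(n)

AB = np.hstack([A, B])
y_tr, y_te = y[:n_train], y[n_train:]

# Fit on the training half, evaluate on the validation half.
r2_a = r2(A[n_train:], y_te, ridge_fit(A[:n_train], y_tr))
r2_b = r2(B[n_train:], y_te, ridge_fit(B[:n_train], y_tr))
r2_ab = r2(AB[n_train:], y_te, ridge_fit(AB[:n_train], y_tr))

# Set-theoretic partition of the explained variance.
unique_a = r2_ab - r2_b            # explained only by space A
unique_b = r2_ab - r2_a            # explained only by space B
shared_ab = r2_a + r2_b - r2_ab    # explained by either space

print(f"unique A: {unique_a:.3f}, unique B: {unique_b:.3f}, "
      f"shared: {shared_ab:.3f}")
```

Because the two simulated spaces share columns, each single-space model explains some variance the other also explains; the joint model is needed to separate unique from shared contributions, which is the logic applied voxel-by-voxel in the study.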