Department of Linguistics, University of Arizona, Tucson, AZ, USA; Statistics Consulting Laboratory, BIO5 Institute, University of Arizona, Tucson, AZ, USA.
Department of Psychological Sciences, Birkbeck College, University of London, UK; Birkbeck-UCL Center for Neuroimaging, London, UK; Department of Experimental Psychology, University College London, UK.
Neuroimage. 2018 Sep;178:574-582. doi: 10.1016/j.neuroimage.2018.05.072. Epub 2018 May 31.
Speech sounds are encoded by distributed patterns of activity in bilateral superior temporal cortex. However, it is unclear whether speech sounds are topographically represented in cortex, or which acoustic or phonetic dimensions might be spatially mapped. Here, using functional MRI, we investigated the potential spatial representation of vowels, which are largely distinguished from one another by the frequencies of their first and second formants, i.e. peaks in their frequency spectra. This allowed us to generate clear hypotheses about the representation of specific vowels in tonotopic regions of auditory cortex. We scanned participants as they listened to multiple natural tokens of the vowels [ɑ] and [i], which we selected because their first and second formants overlap minimally. Formant-based regions of interest were defined for each vowel based on spectral analysis of the vowel stimuli and independently acquired tonotopic maps for each participant. We found that perception of [ɑ] and [i] yielded differential activation of tonotopic regions corresponding to formants of [ɑ] and [i], such that each vowel was associated with increased signal in tonotopic regions corresponding to its own formants. This pattern was observed in Heschl's gyrus and the superior temporal gyrus, in both hemispheres, and for both the first and second formants. Using linear discriminant analysis of mean signal change in formant-based regions of interest, the identity of untrained vowels was predicted with ∼73% accuracy. Our findings show that cortical encoding of vowels is scaffolded on tonotopy, a fundamental organizing principle of auditory cortex that is not language-specific.
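The decoding step described above (linear discriminant analysis of mean signal change in formant-based regions of interest) can be sketched as a minimal two-class Fisher discriminant. This is an illustrative reconstruction, not the study's code: the feature values below (per-trial signal change in hypothetical [ɑ]-formant vs. [i]-formant ROIs) are invented, and the function names are placeholders.

```python
# Hedged sketch of the decoding analysis: a two-class Fisher linear
# discriminant on per-trial mean signal change in formant-based ROIs.
# All feature values are invented for illustration; the study used
# per-participant fMRI data and tonotopically defined ROIs.

def mean_vec(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

def scatter(rows, m):
    # 2x2 within-class scatter matrix around the class mean m
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def train_lda(class_a, class_b):
    # Fisher direction w = Sw^-1 (mean_a - mean_b); the decision
    # threshold is the projection of the midpoint between class means.
    ma, mb = mean_vec(class_a), mean_vec(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    thresh = sum(w[i] * (ma[i] + mb[i]) / 2.0 for i in range(2))
    return w, thresh

def classify(x, w, thresh):
    # "a" labels the class_a side of the decision boundary
    return "a" if w[0] * x[0] + w[1] * x[1] > thresh else "i"

# Each trial: [signal in [ɑ]-formant ROIs, signal in [i]-formant ROIs]
# (hypothetical values, chosen so each vowel drives its own ROIs)
trials_a = [[0.90, 0.20], [1.00, 0.30], [0.80, 0.10], [1.10, 0.25]]
trials_i = [[0.20, 0.90], [0.30, 1.00], [0.15, 0.85], [0.25, 1.05]]
w, thresh = train_lda(trials_a, trials_i)
```

Under this toy setup, an untrained trial with stronger signal in the [ɑ]-formant ROIs (e.g. `[1.0, 0.2]`) projects onto the [ɑ] side of the boundary; the reported ∼73% accuracy comes from applying this kind of classifier to held-out trials of real data.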