NTT Communication Science Laboratories, NTT Corporation, 3-1 Morinosato Wakamiya, Atsugi, Kanagawa 243-0198, Japan.
Philos Trans R Soc Lond B Biol Sci. 2012 Apr 5;367(1591):977-87. doi: 10.1098/rstb.2011.0370.
Recent studies have shown that auditory scene analysis involves distributed neural sites below, in, and beyond the auditory cortex (AC). However, it remains unclear what role each site plays and how the sites interact in the formation and selection of auditory percepts. We addressed this issue through two perceptual multistability phenomena: spontaneous perceptual switching in auditory streaming (AS) for a sequence of repeated triplet tones, and perceptual changes for a repeated word, known as verbal transformations (VTs). An event-related fMRI analysis revealed brain activity time-locked to perceptual switching in the cerebellum for AS, in frontal areas for VT, and in the AC and thalamus for both. The results suggest that motor-based prediction, produced by neural networks outside the auditory system, plays an essential role in the segmentation of acoustic sequences in both AS and VT. The frequency of perceptual switching was determined by a balance between the activation of two sites, which are proposed to be involved in exploring novel perceptual organization and in stabilizing the current perceptual organization, respectively. The effect of a catechol-O-methyltransferase (COMT) gene polymorphism on individual variations in switching frequency suggests that the balance between exploration and stabilization is modulated by catecholamines such as dopamine and noradrenaline. These mechanisms would support the noteworthy flexibility of auditory scene analysis.