Brefczynski-Lewis Julie, Lowitszch Svenja, Parsons Michael, Lemieux Susan, Puce Aina
Center for Advanced Imaging & Department of Radiology, West Virginia University, Morgantown, WV, USA.
Brain Topogr. 2009 May;21(3-4):193-206. doi: 10.1007/s10548-009-0093-6. Epub 2009 Apr 23.
In an everyday social interaction we automatically integrate another's facial movements and vocalizations, be they linguistic or otherwise. This requires audiovisual integration of a continual barrage of sensory input, a phenomenon previously well studied with human audiovisual speech but not with non-verbal vocalizations. Using both fMRI and ERPs, we assessed neural activity elicited by viewing and listening to an animated female face producing non-verbal human vocalizations (e.g., coughing, sneezing) under audio-only (AUD), visual-only (VIS) and audiovisual (AV) stimulus conditions, alternating with Rest (R). Underadditive effects occurred in regions dominant for sensory processing, where AV activation was greater than that of the dominant modality alone. Right posterior temporal and parietal regions showed an AV maximum, in which AV activation was greater than either modality alone but not greater than the sum of the unisensory conditions. Other frontal and parietal regions showed common activation, in which AV activation was the same as in one or both unisensory conditions. ERP data showed an early superadditive effect (AV > AUD + VIS, no rest), mid-range underadditive effects for the auditory N140 and the face-sensitive N170, and late AV-maximum and common-activation effects. Based on the convergence between the fMRI and ERP data, we propose a mechanism in which a multisensory stimulus may be signaled as early as 60 ms and facilitated in sensory-specific regions through increased processing speed (at the N170) and efficiency (reduced amplitude of auditory and face-sensitive cortical activation and ERPs). Finally, higher-order processes are also altered, but in a more complex fashion.
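The abstract labels responses by how AV activation compares with the AUD and VIS responses (superadditive, underadditive, AV maximum, common activation). The sketch below is a hypothetical Python encoding of those comparisons, for illustration only; the function name, tolerance and example values are assumptions and not the paper's actual criteria, which rest on statistical contrasts of fMRI and ERP measures rather than a fixed numeric threshold.

# Hypothetical sketch: label an AV response relative to its unisensory
# counterparts, following the comparison rules named in the abstract.
# The tolerance and example values are illustrative, not from the paper.
def classify_av_response(aud: float, vis: float, av: float,
                         tol: float = 0.05) -> str:
    if av > aud + vis + tol:
        return "superadditive"      # AV exceeds the summed unisensory responses
    if abs(av - aud) <= tol or abs(av - vis) <= tol:
        return "common activation"  # AV matches one or both unisensory responses
    if av > max(aud, vis) + tol:
        return "AV maximum"         # AV exceeds either modality alone, not their sum
    return "underadditive"          # AV falls short of the summed unisensory responses

# Example with made-up activation estimates for a posterior temporal region
print(classify_av_response(aud=1.2, vis=0.9, av=1.6))  # -> "AV maximum"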