Hu Yu, Mohsenzadeh Yalda
Western Institute for Neuroscience, Western University, London, ON, Canada.
Vector Institute for Artificial Intelligence, Toronto, ON, Canada.
Commun Biol. 2025 Jan 22;8(1):110. doi: 10.1038/s42003-024-07434-5.
Our brain seamlessly integrates distinct sensory information to form a coherent percept. However, when real-world audiovisual events are perceived, the specific brain regions involved and the timing with which different levels of information are processed remain underinvestigated. To address this, we curated naturalistic videos and recorded functional magnetic resonance imaging (fMRI) and electroencephalography (EEG) data while participants viewed the videos with their accompanying sounds. Our findings reveal an early asymmetric cross-modal interaction: acoustic information was represented in both early visual and early auditory regions, whereas visual information was identified only in visual cortices. Visual and auditory features were processed with similar onsets but different temporal dynamics. High-level categorical and semantic information emerged later in multisensory association areas, indicating late cross-modal integration and its distinct role in converging conceptual information. Comparing neural representations with a two-branch deep neural network model highlighted the necessity of early cross-modal connections for building a biologically plausible model of audiovisual perception. With EEG-fMRI fusion, we provide a spatiotemporally resolved account of neural activity during the processing of naturalistic audiovisual stimuli.
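The abstract does not spell out the fusion procedure, but EEG-fMRI fusion of this kind is commonly implemented with representational similarity analysis (RSA): a time-resolved EEG representational dissimilarity matrix (RDM) is correlated with per-region fMRI RDMs, yielding a time course of when each region's representational geometry appears in the EEG signal. A minimal sketch under that assumption (all variable names, shapes, and the choice of correlation distance are illustrative, not taken from the paper):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def eeg_fmri_fusion(eeg_data, fmri_rdms):
    """Similarity-based EEG-fMRI fusion via RSA (hypothetical sketch).

    eeg_data  : (n_stimuli, n_channels, n_timepoints) evoked EEG patterns
    fmri_rdms : dict mapping ROI name -> condensed fMRI RDM, a vector of
                length n_stimuli * (n_stimuli - 1) / 2

    Returns a dict mapping each ROI to an (n_timepoints,) time course of
    Spearman correlations: how strongly that region's representational
    geometry is expressed in the EEG signal at each moment.
    """
    n_stim, _, n_time = eeg_data.shape
    fusion = {roi: np.zeros(n_time) for roi in fmri_rdms}
    for t in range(n_time):
        # Condensed EEG RDM at time t: 1 - Pearson r between the channel
        # patterns evoked by each pair of stimuli.
        eeg_rdm = pdist(eeg_data[:, :, t], metric="correlation")
        for roi, fmri_rdm in fmri_rdms.items():
            rho, _ = spearmanr(eeg_rdm, fmri_rdm)
            fusion[roi][t] = rho
    return fusion
```

Peaks in such a time course are typically read as the moments at which a region's representational content emerges, which is how a fusion analysis can localize effects in both space and time.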
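The comparison with the two-branch network is presumably also layer-wise RSA: each layer's activation RDM is correlated with a neural RDM, so one can ask which layers, and which branch or cross-modal merge point, best match a given region or time point. A sketch under the same assumptions, reusing the imports above (names are hypothetical):

```python
def layer_rdm(activations):
    """activations: (n_stimuli, n_features) layer responses -> condensed RDM."""
    return pdist(activations, metric="correlation")

def compare_model_to_brain(layer_activations, neural_rdm):
    """Spearman correlation between each layer's RDM and one neural RDM.

    layer_activations : dict mapping layer name -> (n_stimuli, n_features)
    neural_rdm        : condensed RDM from fMRI (an ROI) or EEG (a timepoint)
    """
    return {layer: spearmanr(layer_rdm(acts), neural_rdm)[0]
            for layer, acts in layer_activations.items()}
```

Under this reading, the abstract's claim about early cross-modal connections would correspond to a model variant whose branches exchange information at early layers fitting the neural RDMs better than a late-fusion variant.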