Cichy Radoslaw Martin, Teng Santani
Department of Education and Psychology, Free University Berlin, Berlin, Germany
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.
Philos Trans R Soc Lond B Biol Sci. 2017 Feb 19;372(1714). doi: 10.1098/rstb.2016.0108. Epub 2017 Jan 2.
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue 'Auditory and visual scene analysis'.
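The integration step named in pillar (iii), representational similarity analysis, can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the function names (`rdm`, `upper_triangle`, `ordinal_ranks`, `rdm_similarity`) are hypothetical, and the choices of 1 minus Pearson correlation as the dissimilarity measure and a Spearman-style rank correlation for comparing matrices are assumptions made for illustration.

```python
import numpy as np


def rdm(patterns):
    """Representational dissimilarity matrix (conditions x conditions).

    Each row of `patterns` is the response pattern for one condition.
    Dissimilarity is taken as 1 - Pearson correlation (an assumption).
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square matrix as a vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def ordinal_ranks(values):
    """Ordinal ranks of a 1-D array (simplification: ties are not averaged)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def rdm_similarity(rdm_a, rdm_b):
    """Rank correlation between the upper triangles of two RDMs."""
    ranks_a = ordinal_ranks(upper_triangle(rdm_a))
    ranks_b = ordinal_ranks(upper_triangle(rdm_b))
    return np.corrcoef(ranks_a, ranks_b)[0, 1]


# Synthetic stand-ins for two measurement spaces that share a latent
# representation of the same 8 stimulus conditions (purely illustrative).
rng = np.random.default_rng(0)
n_conditions = 8
latent = rng.normal(size=(n_conditions, 5))
meg_patterns = latent @ rng.normal(size=(5, 30)) + 0.1 * rng.normal(size=(n_conditions, 30))
fmri_patterns = latent @ rng.normal(size=(5, 50)) + 0.1 * rng.normal(size=(n_conditions, 50))

meg_rdm = rdm(meg_patterns)
fmri_rdm = rdm(fmri_patterns)
similarity = rdm_similarity(meg_rdm, fmri_rdm)
print(f"RDM rank correlation: {similarity:.3f}")
```

Because both matrices are indexed by the same conditions, they can be compared even though the underlying measurement spaces (sensors, voxels, or model units) have different dimensionality, which is what makes this kind of comparison usable across imaging methods and models.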