Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, United States.
Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, United States.
eLife. 2022 May 30;11:e71736. doi: 10.7554/eLife.71736.
Successful engagement with the world requires the ability to predict what will happen next. Here, we investigate how the brain makes a fundamental prediction about the physical world: whether the situation in front of us is stable, and hence likely to stay the same, or unstable, and hence likely to change in the immediate future. Specifically, we ask whether judgments of stability can be supported by the kinds of representations that have proven to be highly effective at visual object recognition in both machines and brains, or whether determining the physical stability of natural scenes may instead require generative algorithms that simulate the physics of the world. To find out, we measured responses in both convolutional neural networks (CNNs) and the brain (using fMRI) to natural images of physically stable versus unstable scenarios. We find no evidence for generalizable representations of physical stability either in standard CNNs trained on visual object and scene classification (ImageNet) or in the human ventral visual pathway, which has long been implicated in object and scene recognition. However, in frontoparietal regions previously implicated in intuitive physical reasoning, we find both scenario-invariant representations of physical stability and higher univariate responses to unstable than to stable scenes. These results demonstrate abstract representations of physical stability in the dorsal but not the ventral pathway, consistent with the hypothesis that the computations underlying stability entail not just pattern classification but forward physical simulation.
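The notion of a "generalizable" or "scenario-invariant" representation can be made concrete with a cross-scenario decoding test: fit a classifier on stable versus unstable patterns from one scenario, then evaluate it on a different scenario. The sketch below is only an illustration on synthetic data, not the analysis pipeline used in the study. The nearest-centroid classifier, the feature dimensionality, the signal strength, and all function names (`make_scenario`, `fit_centroids`, `predict`, `cross_scenario_accuracy`) are assumptions chosen for brevity.

```python
import numpy as np


def make_scenario(rng, n_per_class, stability_axis, scenario_offset, signal=3.0):
    """Synthetic patterns for one scenario (hypothetical data, not real fMRI/CNN features).

    Label 0 = stable, 1 = unstable. Unstable items are shifted along
    +stability_axis and stable items along -stability_axis.
    scenario_offset models scenario-specific appearance shared by both classes.
    """
    labels = np.repeat([0, 1], n_per_class)
    noise = rng.normal(size=(2 * n_per_class, stability_axis.size))
    signs = np.where(labels == 1, 1.0, -1.0)[:, None]
    patterns = noise + scenario_offset + signal * signs * stability_axis
    return patterns, labels


def fit_centroids(patterns, labels):
    """Mean pattern per class, after removing the scenario-wide mean."""
    centered = patterns - patterns.mean(axis=0)
    return np.stack([centered[labels == c].mean(axis=0) for c in (0, 1)])


def predict(centroids, patterns):
    """Assign each mean-centered pattern to the nearer class centroid."""
    centered = patterns - patterns.mean(axis=0)
    dists = np.linalg.norm(centered[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def cross_scenario_accuracy(train, test):
    """Train on one scenario, test on another; return the fraction correct."""
    centroids = fit_centroids(*train)
    test_patterns, test_labels = test
    return float((predict(centroids, test_patterns) == test_labels).mean())


rng = np.random.default_rng(0)
n_features = 50

# A unit-length direction that encodes stability in the training scenario.
shared_axis = rng.normal(size=n_features)
shared_axis /= np.linalg.norm(shared_axis)

# A second unit-length direction, orthogonal to the first (Gram-Schmidt step).
other_axis = rng.normal(size=n_features)
other_axis -= (other_axis @ shared_axis) * shared_axis
other_axis /= np.linalg.norm(other_axis)

offset_a = rng.normal(size=n_features)
offset_b = rng.normal(size=n_features)

train = make_scenario(rng, 200, shared_axis, offset_a)
# Case 1: the new scenario encodes stability along the same direction.
test_shared = make_scenario(rng, 200, shared_axis, offset_b)
# Case 2: the new scenario encodes stability along an unrelated direction.
test_unrelated = make_scenario(rng, 200, other_axis, offset_b)

acc_shared = cross_scenario_accuracy(train, test_shared)
acc_unrelated = cross_scenario_accuracy(train, test_unrelated)
print(f"shared stability axis:    {acc_shared:.2f}")
print(f"unrelated stability axis: {acc_unrelated:.2f}")
```

In this toy setting, decoding transfers well when both scenarios share a stability direction and falls to roughly chance when they do not. That contrast is the logic behind distinguishing a scenario-invariant code from one that separates stable and unstable items only within a single scenario.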