Watanabe Eiji, Kitaoka Akiyoshi, Sakamoto Kiwako, Yasugi Masaki, Tanaka Kenta
Laboratory of Neurophysiology, National Institute for Basic Biology, Okazaki, Japan.
Department of Basic Biology, The Graduate University for Advanced Studies (SOKENDAI), Miura, Japan.
Front Psychol. 2018 Mar 15;9:345. doi: 10.3389/fpsyg.2018.00345. eCollection 2018.
The cerebral cortex predicts visual motion to adapt human behavior to surrounding objects moving in real time. Although the underlying mechanisms are still unknown, predictive coding is one of the leading theories. Predictive coding assumes that the brain's internal models (which are acquired through learning) predict the visual world at all times and that errors between the prediction and the actual sensory input further refine the internal models. In the past year, a deep neural network based on predictive coding was reported as a video prediction machine called PredNet. If the theory substantially reproduces the visual information processing of the cerebral cortex, then PredNet can be expected to represent the human visual perception of motion. In this study, PredNet was trained with natural scene videos of the self-motion of the viewer, and the motion prediction ability of the obtained computer model was verified using unlearned videos. We found that the computer model accurately predicted the magnitude and direction of motion of a rotating propeller in unlearned videos. Surprisingly, it also represented the rotational motion for illusion images that were not moving physically, much like human visual perception. While the trained network accurately reproduced the direction of illusory rotation, it did not detect motion components in negative control pictures wherein people do not perceive illusory motion. This research supports the exciting idea that the mechanism assumed by the predictive coding theory is one of the bases of motion illusion generation. Using sensory illusions as indicators of human perception, deep neural networks are expected to contribute significantly to the development of brain research.
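The predictive coding loop described above (predict the next input, compare with the actual input, use the error to refine the internal model) can be illustrated with a toy sketch. This is not PredNet and not the authors' code: it assumes a hypothetical 2x2 linear model `W` that watches a point rotating on a unit circle and is updated by a simple delta rule. The names `train_predictor` and `estimated_rotation`, the learning rate, and the step count are all illustrative assumptions.

```python
import math

def train_predictor(step_angle, n_steps=500, lr=0.5):
    """Toy predictive coding loop (assumption: linear model, delta-rule update).

    The 'world' is a point rotating on the unit circle by step_angle per frame.
    The internal model W predicts the next position from the current one and
    is refined by the prediction error at every step.
    """
    W = [[0.0, 0.0], [0.0, 0.0]]  # internal model, initially predicts nothing
    errors = []
    for t in range(n_steps):
        # Current and next sensory input
        x = (math.cos(t * step_angle), math.sin(t * step_angle))
        y = (math.cos((t + 1) * step_angle), math.sin((t + 1) * step_angle))
        # Prediction of the next input from the internal model
        pred = (W[0][0] * x[0] + W[0][1] * x[1],
                W[1][0] * x[0] + W[1][1] * x[1])
        # Prediction error between actual and predicted input
        err = (y[0] - pred[0], y[1] - pred[1])
        errors.append(math.hypot(err[0], err[1]))
        # The error refines the internal model: W += lr * err * x^T
        for i in range(2):
            for j in range(2):
                W[i][j] += lr * err[i] * x[j]
    return W, errors

def estimated_rotation(W):
    """Read the signed rotation angle (radians per frame) out of the model."""
    return math.atan2(W[1][0] - W[0][1], W[0][0] + W[1][1])

if __name__ == "__main__":
    W, errors = train_predictor(0.3)
    print("first error:", errors[0], "last error:", errors[-1])
    print("estimated rotation per frame:", estimated_rotation(W))
```

In this sketch the sign of `estimated_rotation(W)` gives the learned direction of rotation and its absolute value the magnitude, loosely mirroring the direction and magnitude readout mentioned in the abstract. The real study uses a deep convolutional recurrent network on video frames, which this linear toy does not attempt to reproduce.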