Department of Psychology, Johann Wolfgang Goethe-Universität, Frankfurt, Germany. Electronic address: https://www.scenegrammarlab.com/.
Vision Res. 2021 Apr;181:10-20. doi: 10.1016/j.visres.2020.11.003. Epub 2021 Jan 8.
We live in a rich, three-dimensional world with complex arrangements of meaningful objects. For decades, however, theories of visual attention and perception have been based on findings generated from lines and color patches. While these theories have been indispensable for our field, the time has come to move on from this rather impoverished view of the world and (at least try to) get closer to the real thing. After all, our visual environment consists of objects that we not only look at, but constantly interact with. Having incorporated the meaning and structure of scenes, i.e., their "grammar", allows us to easily understand objects and scenes we have never encountered before. Studying this grammar provides us with the fascinating opportunity to gain new insights into the complex workings of attention, perception, and cognition. In this review, I will discuss how the meaning and the complex yet predictive structure of real-world scenes influence attention allocation, search, and object identification.