Warren William H
Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912, USA.
Perception. 2012;41(9):1053-60. doi: 10.1068/p7327.
David Marr's book Vision attempted to formulate a thoroughgoing formal theory of perception. Marr borrowed much of the "computational" level from James Gibson: a proper understanding of the goal of vision, the natural constraints, and the available information are prerequisite to describing the processes and mechanisms by which the goal is achieved. Yet, as a research program leading to a computational model of human vision, Marr's program did not succeed. This article asks why, using the perception of 3D shape as a morality tale. Marr presumed that the goal of vision is to recover a general-purpose Euclidean description of the world, which can be deployed for any task or action. On this formulation, vision is underdetermined by information, which in turn necessitates auxiliary assumptions to solve the problem. But Marr's assumptions did not actually reflect natural constraints, and consequently the solutions were not robust. We now know that humans do not in fact recover Euclidean structure; rather, they reliably perceive qualitative shape (hills, dales, courses, ridges), which is specified by the second-order differential structure of images. By recasting the goals of vision in terms of our perceptual competencies, and doing the hard work of analyzing the information available under ecological constraints, we can reformulate the problem so that perception is determined by information and prior knowledge is unnecessary.