Does this computational theory solve the right problem? Marr, Gibson, and the goal of vision.

Author information

Warren William H

Affiliation

Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912, USA.

Publication information

Perception. 2012;41(9):1053-60. doi: 10.1068/p7327.

Abstract

David Marr's book Vision attempted to formulate a thoroughgoing formal theory of perception. Marr borrowed much of the "computational" level from James Gibson: a proper understanding of the goal of vision, the natural constraints, and the available information are prerequisite to describing the processes and mechanisms by which the goal is achieved. Yet, as a research program leading to a computational model of human vision, Marr's program did not succeed. This article asks why, using the perception of 3D shape as a morality tale. Marr presumed that the goal of vision is to recover a general-purpose Euclidean description of the world, which can be deployed for any task or action. On this formulation, vision is underdetermined by information, which in turn necessitates auxiliary assumptions to solve the problem. But Marr's assumptions did not actually reflect natural constraints, and consequently the solutions were not robust. We now know that humans do not in fact recover Euclidean structure; rather, they reliably perceive qualitative shape (hills, dales, courses, ridges), which is specified by the second-order differential structure of images. By recasting the goals of vision in terms of our perceptual competencies, and doing the hard work of analyzing the information available under ecological constraints, we can reformulate the problem so that perception is determined by information and prior knowledge is unnecessary.
