Computational gestalts and perception thresholds.

Author information

Desolneux Agnès, Moisan Lionel, Morel Jean-Michel

Affiliation

CMLA, ENS Cachan, 61 av. du président Wilson, 94235 Cachan cedex, France.

Publication information

J Physiol Paris. 2003 Mar-May;97(2-3):311-24. doi: 10.1016/j.jphysparis.2003.09.006.

DOI:10.1016/j.jphysparis.2003.09.006

PMID:14766147

Abstract

In 1923, Max Wertheimer proposed a research programme and method in visual perception. He conjectured the existence of a small set of geometric grouping laws governing the perceptual synthesis of phenomenal objects, or "gestalt" from the atomic retina input. In this paper, we review this set of geometric grouping laws, using the works of Metzger, Kanizsa and their schools. In continuation, we explain why the Gestalt theory research programme can be translated into a Computer Vision programme. This translation is not straightforward, since Gestalt theory never addressed two fundamental matters: image sampling and image information measurements. Using these advances, we shall show that gestalt grouping laws can be translated into quantitative laws allowing the automatic computation of gestalts in digital images. From the psychophysical viewpoint, a main issue is raised: the computer vision gestalt detection methods deliver predictable perception thresholds. Thus, we are set in a position where we can build artificial images and check whether some kind of agreement can be found between the computationally predicted thresholds and the psychophysical ones. We describe and discuss two preliminary sets of experiments, where we compared the gestalt detection performance of several subjects with the predictable detection curve. In our opinion, the results of this experimental comparison support the idea of a much more systematic interaction between computational predictions in Computer Vision and psychophysical experiments.
