Grimson W E
Philos Trans R Soc Lond B Biol Sci. 1981 May 12;292(1058):217-53. doi: 10.1098/rstb.1981.0031.
Recently, Marr & Poggio (1979) presented a theory of human stereo vision. An implementation of that theory is presented, and consists of five steps. (i) The left and right images are each filtered with masks of four sizes that increase with eccentricity; the shape of these masks is given by ∇²G, the Laplacian of a Gaussian function. (ii) Zero crossings in the filtered images are found along horizontal scan lines. (iii) For each mask size, matching takes place between zero crossings of the same sign and roughly the same orientation in the two images, for a range of disparities up to about the width of the mask's central region. Within this disparity range, it can be shown that false targets pose only a simple problem. (iv) The output of the wide masks can control vergence movements, thus causing small masks to come into correspondence. In this way, the matching process gradually moves from dealing with large disparities at a low resolution to dealing with small disparities at a high resolution. (v) When a correspondence is achieved, it is stored in a dynamic buffer, called the 2½-dimensional sketch. To support the adequacy of the Marr-Poggio model of human stereo vision, the implementation was tested on a wide range of stereograms from the human stereopsis literature. The performance of the implementation is illustrated and compared with human perception. Also, statistical assumptions made by Marr & Poggio are supported by comparison with statistics found in practice. Finally, the process of implementing the theory has led to the clarification and refinement of a number of details within the theory; these are discussed in detail.
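Steps (i) to (iii) can be illustrated with a minimal sketch on a single horizontal scan line. This is not Grimson's implementation: it assumes a 1-D second derivative of a Gaussian as a stand-in for the 2-D ∇²G mask, ignores orientation (which has no meaning in 1-D), uses one mask size instead of four, and simply discards ambiguous matches rather than resolving them. The function names, the `sigma` parameter, and the `eps` threshold are all hypothetical.

```python
import math

def d2g_mask(sigma):
    """1-D second derivative of a Gaussian (assumed stand-in for the 2-D mask)."""
    radius = int(math.ceil(4 * sigma))
    mask = [(x * x / sigma**2 - 1.0) * math.exp(-x * x / (2 * sigma**2))
            for x in range(-radius, radius + 1)]
    mean = sum(mask) / len(mask)
    # Subtract the mean so the truncated mask sums to zero and
    # constant-intensity regions produce (near) zero output.
    return [m - mean for m in mask]

def convolve_line(line, mask):
    """Filter one scan line with a symmetric mask, replicating border pixels."""
    radius = len(mask) // 2
    n = len(line)
    out = []
    for i in range(n):
        acc = 0.0
        for k, m in enumerate(mask):
            j = min(max(i + k - radius, 0), n - 1)
            acc += m * line[j]
        out.append(acc)
    return out

def zero_crossings(filtered, eps=1e-9):
    """Return (position, sign) pairs; sign is +1 for negative-to-positive."""
    crossings = []
    for i in range(len(filtered) - 1):
        a, b = filtered[i], filtered[i + 1]
        if a < -eps and b > eps:
            crossings.append((i, +1))
        elif a > eps and b < -eps:
            crossings.append((i, -1))
    return crossings

def match_line(left, right, sigma):
    """Map each left zero crossing to a disparity when the match is unambiguous."""
    w = 2.0 * sigma  # width of the central lobe of this 1-D mask (zeros at +/- sigma)
    mask = d2g_mask(sigma)
    zl = zero_crossings(convolve_line(left, mask))
    zr = zero_crossings(convolve_line(right, mask))
    matches = {}
    for x, sign in zl:
        # Same-sign candidates within the assumed disparity range of +/- w.
        candidates = [xr for xr, s in zr if s == sign and abs(xr - x) <= w]
        if len(candidates) == 1:  # simplification: keep only unambiguous matches
            matches[x] = candidates[0] - x
    return matches

# A bright bar on a dark background, shifted 3 pixels to the right in the right image.
left = [0] * 20 + [1] * 20 + [0] * 24
right = [0] * 23 + [1] * 20 + [0] * 21
disparities = match_line(left, right, sigma=3.0)
```

In this toy example both edges of the bar produce one zero crossing per image, and each left crossing has exactly one same-sign candidate within the search range. Step (iv) would correspond to running `match_line` first with a large `sigma` and using the resulting disparities to shift the images before repeating with a smaller `sigma`; that loop is omitted here.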