Ullman S
Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge 02139.
Philos Trans R Soc Lond B Biol Sci. 1992 Sep 29;337(1281):371-8; discussion 379. doi: 10.1098/rstb.1992.0115.
This paper discusses two problems related to three-dimensional object recognition. The first is segmentation and the selection of a candidate object in the image; the second is the recognition of a three-dimensional object from different viewing positions. Regarding segmentation, it is shown how globally salient structures can be extracted from a contour image based on geometrical attributes, including smoothness and contour length. This computation is performed by a parallel network of locally connected neuron-like elements. With respect to the effect of viewing position, it is shown how the problem can be overcome by using linear combinations of a small number of two-dimensional object views. In both problems the emphasis is on methods that are relatively low level in nature. Segmentation is performed using a bottom-up process, driven by the geometry of image contours. Recognition is performed without using explicit three-dimensional models, but by the direct manipulation of two-dimensional images.
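The linear-combination idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and rests on simplifying assumptions that the abstract does not state: rigid rotation only, orthographic projection, known point correspondences, and only the x-coordinates of each view. All names (`rotation`, `x_coords`, `solve3`, the sample points and angles) are hypothetical and chosen for illustration. Under these assumptions, the x-coordinates of a novel view should be expressible as a weighted sum of the x-coordinates of three stored views, with no explicit three-dimensional model used at recognition time.

```python
import math

def rotation(ax, ay, az):
    """Return the 3x3 rotation matrix Rz(az) * Ry(ay) * Rx(ax)."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def x_coords(points, R):
    """Orthographic projection: image x-coordinate of each rotated point."""
    r = R[0]
    return [r[0] * p[0] + r[1] * p[1] + r[2] * p[2] for p in points]

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(A, b):
    """Solve the 3x3 system A c = b with Cramer's rule."""
    d = det3(A)
    coeffs = []
    for k in range(3):
        Ak = [row[:] for row in A]
        for i in range(3):
            Ak[i][k] = b[i]
        coeffs.append(det3(Ak) / d)
    return coeffs

# Hypothetical 3-D object points (used here only to synthesize the 2-D views).
points = [
    (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
    (1.0, 1.0, 0.5), (-0.5, 0.8, -1.0), (0.3, -0.7, 0.9),
    (-1.2, -0.4, 0.2), (0.6, 0.6, -0.6),
]

# Three stored model views and one novel view (angles are arbitrary).
model_angles = [(0.0, 0.0, 0.0), (0.3, 0.5, 0.1), (-0.4, 0.2, 0.6)]
model_views = [x_coords(points, rotation(*a)) for a in model_angles]
novel_view = x_coords(points, rotation(0.7, -0.3, 0.25))

# Estimate the three coefficients from the first three corresponding points.
A = [[view[i] for view in model_views] for i in range(3)]
coeffs = solve3(A, novel_view[:3])

# Predict every x-coordinate of the novel view from the stored views alone.
predicted = [sum(c * view[i] for c, view in zip(coeffs, model_views))
             for i in range(len(points))]
max_error = max(abs(p - n) for p, n in zip(predicted, novel_view))
print("coefficients:", coeffs)
print("max prediction error:", max_error)
```

The coefficients here are fitted from three points only to keep the sketch short; with noisy image measurements, a least-squares fit over all corresponding points would be the more natural choice. Handling y-coordinates, translation, and scaling would need additional terms that this sketch omits.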