Early processing of visual information.

Author information

Marr D

Publication information

Philos Trans R Soc Lond B Biol Sci. 1976 Oct 19;275(942):483-519. doi: 10.1098/rstb.1976.0090.

Abstract

An introduction is given to a theory of early visual information processing. The theory has been implemented, and examples are given of images at various stages of analysis. It is argued that the first step of consequence is to compute a primitive but rich description of the grey-level changes present in an image. The description is expressed in a vocabulary of kinds of intensity change (EDGE, SHADING-EDGE, EXTENDED-EDGE, LINE, BLOB etc.). Modifying parameters are bound to the elements in the description, specifying their POSITION, ORIENTATION, TERMINATION points, CONTRAST, SIZE and FUZZINESS. This description is obtained from the intensity array by fixed techniques, and it is called the primal sketch. For most images, the primal sketch is large and unwieldy. The second important step in visual information processing is to group its contents in a way that is appropriate for later recognition. From our ability to interpret drawings with little semantic content, one may infer the presence in our perceptual equipment of symbolic processes that can define "place-tokens" in an image in various ways, and can group them according to certain rules. Homomorphic techniques fail to account for many of these grouping phenomena, whose explanations require mechanisms of construction rather than mechanisms of detection. The necessary grouping of elements in the primal sketch may be achieved by a mechanism that has available the processes inferred from above, together with the ability to select items by first order discriminations acting on the elements' parameters. Only occasionally do these mechanisms use downward-flowing information about the contents of the particular image being processed. It is argued that "non-attentive" vision is in practice implemented by these grouping operations and first order discriminations acting on the primal sketch. The class of computations so obtained differs slightly from the class of second order operations on the intensity array. 
The extraction of a form from the primal sketch using these techniques amounts to the separation of figure from ground. It is concluded that most of the separation can be carried out by using techniques that do not depend upon the particular image in question. Therefore, figure-ground separation can normally precede the description of the shape of the extracted form. Up to this point, higher-level knowledge and purpose are brought to bear on only a few of the decisions taken during the processing. This relegates the widespread use of downward-flowing information to a later stage than is found in current machine-vision programs, and implies that such knowledge should influence the control of, rather than interfering with, the actual data-processing that is taking place lower down.
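As an illustration of the vocabulary described in the abstract, the following is a minimal Python sketch of how primal-sketch elements and a first order discrimination on their parameters might be represented. It is not Marr's implementation: the abstract does not specify a data layout, so the class names `Kind` and `Primitive`, the helper functions `select` and `orientation_near`, the choice of degrees in [0, 180) for ORIENTATION, and all sample values are assumptions made here for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Tuple

Point = Tuple[float, float]


class Kind(Enum):
    """Kinds of intensity change named in the abstract."""
    EDGE = "EDGE"
    SHADING_EDGE = "SHADING-EDGE"
    EXTENDED_EDGE = "EXTENDED-EDGE"
    LINE = "LINE"
    BLOB = "BLOB"


@dataclass(frozen=True)
class Primitive:
    """One element of a (hypothetical) primal sketch with its bound parameters."""
    kind: Kind
    position: Point
    orientation: float              # assumed: degrees in [0, 180)
    contrast: float
    size: float
    fuzziness: float
    terminations: Tuple[Point, ...] = ()


def select(sketch: List[Primitive],
           predicate: Callable[[Primitive], bool]) -> List[Primitive]:
    """First order discrimination: keep elements whose own parameters pass a test."""
    return [p for p in sketch if predicate(p)]


def orientation_near(target: float, tolerance: float) -> Callable[[Primitive], bool]:
    """Build a test for orientations within `tolerance` degrees of `target`.

    Orientations are treated as undirected, so 178 degrees is 2 degrees from 0.
    """
    def predicate(p: Primitive) -> bool:
        diff = abs(p.orientation - target) % 180.0
        return min(diff, 180.0 - diff) <= tolerance
    return predicate


# Made-up sample data, not taken from the paper.
sketch = [
    Primitive(Kind.EDGE, (0.0, 0.0), 88.0, 0.9, 4.0, 0.1),
    Primitive(Kind.EDGE, (1.0, 5.0), 92.0, 0.8, 5.0, 0.2),
    Primitive(Kind.LINE, (7.0, 2.0), 10.0, 0.4, 3.0, 0.1),
    Primitive(Kind.BLOB, (3.0, 3.0), 178.0, 0.5, 2.0, 0.6),
]

near_vertical = select(sketch, orientation_near(90.0, 5.0))
near_horizontal = select(sketch, orientation_near(0.0, 5.0))
```

The selection here looks only at each element's own parameters, which is one reading of "first order discriminations acting on the elements' parameters". The grouping rules and place-token construction that the abstract also mentions are not modelled.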

