Watson A B, Ahumada A J
IEEE Trans Biomed Eng. 1989 Jan;36(1):97-106. doi: 10.1109/10.16453.
Retinal ganglion cells represent the visual image with a spatial code, in which each cell conveys information about a small region in the image. In contrast, cells of primary visual cortex employ a hybrid space-frequency code in which each cell conveys information about a region that is local in space, spatial frequency, and orientation. Despite the presumable importance of this transformation, we lack any comprehensive notion of how it occurs. Here we describe a mathematical model for this transformation. The hexagonal orthogonal-oriented quadrature pyramid (HOP) transform, which operates on a hexagonal input lattice, employs basis functions that are orthogonal, self-similar, and localized in space, spatial frequency, orientation, and phase. The basis functions, which are generated from seven basic types through a recursive process, form an image code of the pyramid type. The seven basis functions, six bandpass and one low-pass, occupy a point and a hexagon of six nearest neighbors on a hexagonal sample lattice. The six bandpass basis functions consist of three with even symmetry and three with odd symmetry. The three even kernels are rotations of 0, 60, and 120 degrees of a common kernel; likewise for the three odd kernels. At the lowest level, the inputs are image samples. At each higher level, the input lattice is provided by the low-pass coefficients computed at the previous level. At each level, the output is subsampled in such a way as to yield a new hexagonal lattice with a spacing √7 times larger than that of the previous level, so that the number of coefficients is reduced by a factor of seven at each level. In the biological model, the input lattice is the retinal ganglion cell array. The resulting scheme provides a compact, efficient code of the image and generates receptive fields that resemble those of the primary visual cortex.
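The lattice geometry described in the abstract can be checked with a short Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the axial coordinate convention, the choice of (2, 1) as the sublattice generator (its mirror image would serve equally well), and all function names are hypothetical. The sketch shows that such a sublattice has a spacing of √7, that the center point and its six nearest neighbors fall into seven distinct cosets (so seven-point clusters tile the lattice), and that the per-level coefficient counts of a pyramid built this way sum to the number of input samples. It does not compute the HOP kernels themselves.

```python
import math

# Axial coordinates (a, b) on a unit-spacing hexagonal lattice (assumed convention):
# position = a*e1 + b*e2 with e1 = (1, 0) and e2 = (1/2, sqrt(3)/2).
# A center point followed by its six nearest neighbors.
SEVEN_POINT_CLUSTER = [(0, 0), (1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def to_cartesian(a, b):
    """Convert axial lattice coordinates to Cartesian (x, y)."""
    return (a + 0.5 * b, (math.sqrt(3) / 2) * b)


def coset_label(a, b):
    """Return a label in 0..6 for the lattice point (a, b).

    Label 0 marks the sublattice spanned by (2, 1) and its 60-degree
    rotation (-1, 3). This generator is an assumption made for illustration.
    """
    return (3 * a + b) % 7


def is_retained(a, b):
    """True if (a, b) survives subsampling onto the coarser lattice."""
    return coset_label(a, b) == 0


def coefficient_counts(levels):
    """Bandpass coefficient count per level for an input of 7**levels samples.

    Each seven-point cluster yields six bandpass coefficients and one
    low-pass coefficient; the low-pass outputs feed the next level.
    Returns (list of bandpass counts, number of final low-pass coefficients).
    """
    bandpass = [6 * 7 ** (levels - k) for k in range(1, levels + 1)]
    return bandpass, 1


if __name__ == "__main__":
    # Spacing of the subsampled lattice relative to the input lattice.
    print(math.hypot(*to_cartesian(2, 1)))
    # Coset labels of the seven-point cluster.
    print(sorted(coset_label(a, b) for a, b in SEVEN_POINT_CLUSTER))
    # Coefficient budget for a three-level pyramid.
    bandpass, lowpass = coefficient_counts(3)
    print(bandpass, lowpass, sum(bandpass) + lowpass)
```

Because the seven labels of the cluster are all distinct, every input sample belongs to exactly one cluster centered on a retained point, which is consistent with the sevenfold reduction at each level stated in the abstract.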