Reitboeck H J, Altmann J
Biol Cybern. 1984;51(2):113-21. doi: 10.1007/BF00357924.
The mapping of retinal space onto the striate cortex of some mammals can be approximated by a log-polar function. It has been proposed that this mapping is of functional importance for scale- and rotation-invariant pattern recognition in the visual system. An exact log-polar transform converts centered scaling and rotation into translations. A subsequent translation-invariant transform, such as the absolute value of the Fourier transform, thus generates overall size- and rotation-invariance. In our model, the translation-invariance is realized via the R-transform. This transform can be executed by simple neural networks, and it does not require the complex computations of the Fourier transform used in Mellin-transform size-invariance models. The logarithmic space distortion and differentiation in the first processing stage of the model are realized via "Mexican hat" filters whose diameter increases linearly with eccentricity, similar to the characteristics of the receptive fields of retinal ganglion cells. Except for some special cases, the model can explain object recognition independent of size, orientation and position. Some general problems of Mellin-type size-invariance models, which also apply to our model, are discussed.
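The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the R-transform, so the sketch assumes it is the "rapid transform" butterfly of pairwise sums and absolute differences, applied recursively to a 1-D signal whose length is a power of two. The function names (`log_polar`, `scale_rotate`, `r_transform`, `cyclic_shift`) and the sample values are hypothetical and chosen only for illustration.

```python
import math


def log_polar(x, y):
    """Map a Cartesian point to log-polar coordinates (log r, theta)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)


def scale_rotate(x, y, s, phi):
    """Scale by s and rotate by phi (radians) about the origin."""
    c, sn = math.cos(phi), math.sin(phi)
    return s * (c * x - sn * y), s * (sn * x + c * y)


def r_transform(v):
    """Assumed R-transform: recursive butterfly of sums and absolute differences.

    The input length must be a power of two. Because each stage combines
    element i with element i + n/2, a cyclic shift of the input only
    shifts the half-length sum and difference sequences cyclically, so
    the final output does not change.
    """
    n = len(v)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(v)
    half = n // 2
    sums = [v[i] + v[i + half] for i in range(half)]
    diffs = [abs(v[i] - v[i + half]) for i in range(half)]
    return r_transform(sums) + r_transform(diffs)


def cyclic_shift(v, k):
    """Return v rotated left by k positions (cyclically)."""
    k %= len(v)
    return list(v[k:]) + list(v[:k])


if __name__ == "__main__":
    # Scaling by 2 and rotating by 0.3 rad shifts (log r, theta)
    # by approximately (log 2, 0.3).
    u0, t0 = log_polar(1.0, 0.5)
    u1, t1 = log_polar(*scale_rotate(1.0, 0.5, 2.0, 0.3))
    print(u1 - u0, t1 - t0)

    # The assumed R-transform gives the same output for every cyclic shift.
    signal = [3, 1, 4, 1, 5, 9, 2, 6]
    reference = r_transform(signal)
    print(all(r_transform(cyclic_shift(signal, k)) == reference
              for k in range(len(signal))))
```

The sketch uses only sums, differences and absolute values, which is in line with the abstract's remark that the transform avoids the complex arithmetic of the Fourier transform. It is 1-D and invariant to cyclic shifts only. Among the log-polar axes only the angular one is periodic, so how shifts along the log-radius axis are handled in the actual model is not something this sketch addresses.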