Spratling Michael W
Division of Engineering, King's College, London, UK.
IEEE Trans Pattern Anal Mach Intell. 2005 May;27(5):753-61. doi: 10.1109/TPAMI.2005.105.
In order to perform object recognition, it is necessary to form perceptual representations that are sufficiently specific to distinguish between objects, but that are also sufficiently flexible to generalize across changes in location, rotation, and scale. A standard method for learning perceptual representations that are invariant to viewpoint is to form temporal associations across image sequences showing object transformations. However, this method requires that individual stimuli be presented in isolation and is therefore unlikely to succeed in real-world applications where multiple objects can co-occur in the visual input. This paper proposes a simple modification to the learning method that can overcome this limitation and results in more robust learning of invariant representations.
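The "temporal associations across image sequences" mentioned above are commonly implemented with a trace learning rule, in which a unit's recent activity is averaged over time so that successive views of a transforming object strengthen the same weights. The following is a minimal sketch of a generic Földiák-style trace rule for a single linear unit, not the modified method this paper proposes (the abstract does not specify it). The function names, the parameters `delta` and `alpha`, and their default values are all assumptions chosen for illustration.

```python
def trace_rule_step(weights, trace, x, delta=0.2, alpha=0.05):
    """Apply one update of a generic trace rule to a single linear unit.

    weights: current synaptic weights (list of floats)
    trace:   running temporal average of the unit's output
    x:       current input frame (list of floats, same length as weights)
    delta:   how strongly the current output updates the trace (assumed value)
    alpha:   learning rate (assumed value)
    """
    # Instantaneous output of the linear unit for this frame.
    y = sum(w * xi for w, xi in zip(weights, x))
    # The trace blends the previous trace with the current output.
    new_trace = (1.0 - delta) * trace + delta * y
    # Weights move toward the current input in proportion to the trace,
    # so inputs that follow one another in time become associated.
    new_weights = [w + alpha * new_trace * (xi - w) for w, xi in zip(weights, x)]
    return new_weights, new_trace


def train_on_sequence(weights, frames, delta=0.2, alpha=0.05):
    """Run the trace rule over one image sequence, starting with an empty trace."""
    trace = 0.0
    for x in frames:
        weights, trace = trace_rule_step(weights, trace, x, delta, alpha)
    return weights
```

Because the trace carries activity from one frame to the next, the rule associates whatever appears in consecutive frames. That is consistent with the limitation noted in the abstract: when several objects co-occur in the input, a rule of this kind has no means of telling which inputs belong to the same object.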