Liu Z, Kersten D
NEC Research Institute, Princeton, NJ 08540, USA.
Vision Res. 1998 Aug;38(15-16):2507-19. doi: 10.1016/s0042-6989(98)00063-7.
In human object recognition, converging evidence has shown that subjects' performance depends on their familiarity with an object's appearance. The extent of this dependence is a function of inter-object similarity: the more similar the objects are, the stronger the dependence and the more dominant the two-dimensional (2D) image-based information. However, the degree to which three-dimensional (3D) model-based information is used remains an area of strong debate. Previously, the authors showed that no model using independent 2D templates that allow 2D rotations in the image plane can account for human performance in discriminating novel object views. Here, the authors derive an analytic formulation of a Bayesian model that gives rise to the best possible performance under 2D affine transformations and demonstrate that this model cannot account for human performance in 3D object discrimination. Relative to this model, human statistical efficiency is higher for novel views than for learned views, suggesting that human observers use some 3D structural information.
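The comparison of human and model performance rests on statistical efficiency. The sketch below illustrates one common definition from the ideal-observer literature, the squared ratio of human to ideal sensitivity (d'); the abstract does not state which definition the paper uses, so this form is an assumption. The function names, the two-alternative forced-choice (2AFC) conversion, and all numeric values are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist


def dprime_2afc(proportion_correct: float) -> float:
    """Convert 2AFC proportion correct to d' using d' = sqrt(2) * z(Pc).

    Assumes an unbiased observer and equal-variance Gaussian noise.
    """
    if not 0.0 < proportion_correct < 1.0:
        raise ValueError("proportion_correct must lie strictly between 0 and 1")
    return 2 ** 0.5 * NormalDist().inv_cdf(proportion_correct)


def statistical_efficiency(dprime_human: float, dprime_ideal: float) -> float:
    """Efficiency as the squared ratio of human to ideal-observer d'.

    This is one common definition; the paper may define it differently.
    """
    if dprime_ideal <= 0:
        raise ValueError("dprime_ideal must be positive")
    return (dprime_human / dprime_ideal) ** 2


if __name__ == "__main__":
    # Hypothetical sensitivities, chosen only to illustrate the comparison.
    conditions = {
        "learned views": (1.0, 2.0),  # (human d', ideal-observer d')
        "novel views": (0.9, 1.2),
    }
    for name, (human, ideal) in conditions.items():
        print(f"{name}: efficiency = {statistical_efficiency(human, ideal):.2f}")
```

In this made-up example, human sensitivity is lower for novel views than for learned views, yet efficiency is higher because the 2D affine model's performance drops more sharply. That is the pattern the abstract describes as evidence that observers draw on information the 2D model lacks.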