Department of Psychology, Vanderbilt University, Nashville, Tennessee 37240-7817.
Department of Psychology, Center for the Neural Basis of Cognition, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213.
Annu Rev Vis Sci. 2016 Oct 14;2:377-396. doi: 10.1146/annurev-vision-111815-114621. Epub 2016 Aug 3.
How do we recognize objects despite changes in their appearance? The past three decades have witnessed intense debates over whether objects are encoded invariantly with respect to viewing conditions and whether specialized, separable mechanisms are used to recognize different object categories. We argue that such dichotomous debates ask the wrong question. Much more important is the nature of object representations: What are the features that enable invariance or differential processing between categories? Although the nature of object features remains an open question, new methods for connecting data to models show significant potential for helping us better understand neural codes for objects. Most prominently, new approaches to analyzing functional magnetic resonance imaging (fMRI) data, including neural decoding and representational similarity analysis, and new computational models of vision, including convolutional neural networks, have enabled a much more nuanced understanding of visual representation. Convolutional neural networks are particularly intriguing as a tool for studying biological vision because this class of artificial vision systems, based on biologically plausible deep neural networks, exhibits visual recognition capabilities approaching those of human observers. As these models improve in recognition performance, they also appear to become more effective at predicting and accounting for neural responses in the ventral cortex. Applying these and other deep models to empirical data shows great promise for enabling future progress in the study of visual recognition.
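To make the idea of representational similarity analysis concrete, the following is a minimal sketch of one common way to compare two sets of responses, such as fMRI voxel patterns and the unit activations of a model layer. It is an illustration under stated assumptions, not the procedure of any particular study: the data are synthetic, the names `rdm`, `upper_triangle`, `rank`, and `rsa_score` are hypothetical, correlation distance is assumed as the dissimilarity measure, and the two dissimilarity matrices are compared with a rank correlation that ignores ties.

```python
import numpy as np


def rdm(responses):
    """Representational dissimilarity matrix (RDM).

    responses: array of shape (n_stimuli, n_features), one row per stimulus.
    Returns an (n_stimuli, n_stimuli) matrix of 1 - Pearson r between rows.
    """
    return 1.0 - np.corrcoef(responses)


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries of a square matrix, as a vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def rank(values):
    """Ordinal ranks (no tie correction; adequate for continuous values)."""
    order = np.argsort(values)
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def rsa_score(responses_a, responses_b):
    """Rank correlation between the RDMs of two response sets."""
    a = rank(upper_triangle(rdm(responses_a)))
    b = rank(upper_triangle(rdm(responses_b)))
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic example (assumption): 12 stimuli described by 5 latent features.
rng = np.random.default_rng(0)
n_stimuli = 12
latent = rng.normal(size=(n_stimuli, 5))

# Hypothetical "voxels" and "model units": different noisy linear readouts
# of the same latent features, plus an unrelated control.
voxels = latent @ rng.normal(size=(5, 40)) + 0.1 * rng.normal(size=(n_stimuli, 40))
units = latent @ rng.normal(size=(5, 60)) + 0.1 * rng.normal(size=(n_stimuli, 60))
unrelated = rng.normal(size=(n_stimuli, 60))

print("voxels vs. units:    ", rsa_score(voxels, units))
print("voxels vs. unrelated:", rsa_score(voxels, unrelated))
```

Because the comparison is made between dissimilarity matrices rather than between raw responses, the two systems do not need the same number of measurement channels, which is what allows voxel patterns and model units to be compared directly.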