Why is real-world visual object recognition hard?

Authors

Pinto Nicolas, Cox David D, DiCarlo James J

Affiliation

McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America.

Publication

PLoS Comput Biol. 2008 Jan;4(1):e27. doi: 10.1371/journal.pcbi.0040027.

Abstract

Progress in understanding the brain mechanisms underlying vision requires the construction of computational models that not only emulate the brain's anatomy and physiology, but ultimately match its performance on visual tasks. In recent years, "natural" images have become popular in the study of vision and have been used to show apparently impressive progress in building such models. Here, we challenge the use of uncontrolled "natural" images in guiding that progress. In particular, we show that a simple V1-like model--a neuroscientist's "null" model, which should perform poorly at real-world visual object recognition tasks--outperforms state-of-the-art object recognition systems (biologically inspired and otherwise) on a standard, ostensibly natural image recognition test. As a counterpoint, we designed a "simpler" recognition test to better span the real-world variation in object pose, position, and scale, and we show that this test correctly exposes the inadequacy of the V1-like model. Taken together, these results demonstrate that tests based on uncontrolled natural images can be seriously misleading, potentially guiding progress in the wrong direction. Instead, we reexamine what it means for images to be natural and argue for a renewed focus on the core problem of object recognition--real-world image variation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cabf/2217583/6e425709ae19/pcbi.0040027.g001.jpg
