Suppr超能文献

深度网络在不变目标识别中可模拟人类前馈视觉。

Deep Networks Can Resemble Human Feed-forward Vision in Invariant Object Recognition.

机构信息

Department of Computer Science, School of Mathematics, Statistics, and Computer Science, University of Tehran, Tehran, Iran.

CERCO UMR 5549, CNRS - Université de Toulouse, F-31300, France.

出版信息

Sci Rep. 2016 Sep 7;6:32672. doi: 10.1038/srep32672.

Abstract

Deep convolutional neural networks (DCNNs) have attracted much attention recently, and have shown to be able to recognize thousands of object categories in natural image databases. Their architecture is somewhat similar to that of the human visual system: both use restricted receptive fields, and a hierarchy of layers which progressively extract more and more abstracted features. Yet it is unknown whether DCNNs match human performance at the task of view-invariant object recognition, whether they make similar errors and use similar representations for this task, and whether the answers depend on the magnitude of the viewpoint variations. To investigate these issues, we benchmarked eight state-of-the-art DCNNs, the HMAX model, and a baseline shallow model and compared their results to those of humans with backward masking. Unlike in all previous DCNN studies, we carefully controlled the magnitude of the viewpoint variations to demonstrate that shallow nets can outperform deep nets and humans when variations are weak. When facing larger variations, however, more layers were needed to match human performance and error distributions, and to have representations that are consistent with human behavior. A very deep net with 18 layers even outperformed humans at the highest variation level, using the most human-like representations.

摘要

深度卷积神经网络(DCNN)最近受到了广泛关注,它们在自然图像数据库中能够识别数千种物体类别。其结构与人类视觉系统有些相似:两者都使用受限的感受野,并通过分层结构逐步提取越来越抽象的特征。然而,目前尚不清楚 DCNN 在视图不变性物体识别任务中的表现是否与人类相当,它们在该任务中是否会犯类似的错误并使用类似的表示,以及答案是否取决于视角变化的大小。为了研究这些问题,我们对八个最先进的 DCNN、HMAX 模型和一个基线浅层模型进行了基准测试,并将它们的结果与使用后向掩蔽的人类进行了比较。与之前所有的 DCNN 研究不同,我们仔细控制了视角变化的大小,以证明在变化较弱时,浅层网络可以胜过深层网络和人类。然而,当面临更大的变化时,需要更多的层才能匹配人类的表现和错误分布,并具有与人类行为一致的表示。一个具有 18 层的非常深的网络甚至在最高变化水平上超过了人类,使用了最像人类的表示。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/93a3/5013454/c9690ccc6645/srep32672-f1.jpg

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验