Can deep convolutional neural networks support relational reasoning in the same-different task?

Affiliation

School of Psychological Science, University of Bristol, UK.

Publication information

J Vis. 2022 Sep 2;22(10):11. doi: 10.1167/jov.22.10.11.

Abstract

Same-different visual reasoning is a basic skill central to abstract combinatorial thought. This fact has led neural network researchers to test same-different classification on deep convolutional neural networks (DCNNs), which has resulted in a controversy regarding whether this skill is within the capacity of these models. However, most tests of same-different classification rely on test images that come from the same pixel-level distribution as the training images, rendering the results inconclusive. In this study, we tested relational same-different reasoning in DCNNs. In a series of simulations, we show that models based on the ResNet architecture are capable of visual same-different classification, but only when the test images are similar to the training images at the pixel level. In contrast, when there is a shift in the testing distribution that does not change the relation between the objects in the image, the performance of DCNNs decreases substantially. This finding holds even when the DCNNs' training regime is expanded to include images taken from a wide range of different pixel-level distributions or when the model is trained on the testing distribution but on a different task in a multitask learning context. Furthermore, we show that the relation network, a deep learning architecture specifically designed to tackle visual relational reasoning problems, suffers from the same kind of limitations. Overall, the results of this study suggest that learning same-different relations is beyond the scope of current DCNNs.
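To make the evaluation design concrete, the following is a minimal sketch of how a same-different dataset with a pixel-level distribution shift could be built. It is not the stimulus set used in the study: the function names (`make_pair`, `noise_patch`, `bar_patch`, `make_dataset`), the image and patch sizes, and the two patch styles are all assumptions chosen for illustration. The point it shows is that the label depends only on whether the two patches are identical, while the pixel statistics differ between the training and test sets.

```python
import numpy as np

def noise_patch(rng, patch):
    # Hypothetical "training" style: random binary texture.
    return rng.integers(0, 2, size=(patch, patch)).astype(np.float32)

def bar_patch(rng, patch):
    # Hypothetical "shifted" style: one horizontal or vertical bar.
    p = np.zeros((patch, patch), dtype=np.float32)
    k = rng.integers(0, patch)
    if rng.integers(0, 2):
        p[k, :] = 1.0
    else:
        p[:, k] = 1.0
    return p

def make_pair(rng, same, patch_fn, size=32, patch=8):
    """Return a (size, size) image holding two patches, plus its label."""
    img = np.zeros((size, size), dtype=np.float32)
    a = patch_fn(rng, patch)
    b = a.copy() if same else patch_fn(rng, patch)
    # For a "different" trial, resample until the patches really differ.
    while not same and np.array_equal(a, b):
        b = patch_fn(rng, patch)
    # One patch per image half, so the two patches never overlap.
    half = size // 2
    for p, x0 in ((a, 0), (b, half)):
        r = rng.integers(0, size - patch + 1)
        c = x0 + rng.integers(0, half - patch + 1)
        img[r:r + patch, c:c + patch] = p
    return img, int(same)

def make_dataset(seed, n, patch_fn):
    # Alternate same/different trials so the classes are balanced.
    rng = np.random.default_rng(seed)
    pairs = [make_pair(rng, i % 2 == 0, patch_fn) for i in range(n)]
    X = np.stack([img for img, _ in pairs])
    y = np.array([label for _, label in pairs])
    return X, y

# Same relation (same vs. different), different pixel-level distributions.
train_X, train_y = make_dataset(0, 100, noise_patch)
test_X, test_y = make_dataset(1, 100, bar_patch)
```

Under this setup, a model that has learned the abstract relation should classify `test_X` about as well as held-out noise-patch images, whereas a model that relies on pixel-level regularities of the training style would be expected to degrade, which is the pattern the abstract reports for DCNNs.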

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c49e/9482325/d576309b390c/jovi-22-10-11-f001.jpg