Graduate School of Arts and Sciences, The University of Tokyo.
Department of Philosophy, Faculty of Letters, Keio University.
Cogn Sci. 2023 Mar;47(3):e13258. doi: 10.1111/cogs.13258.
There is a widely held view that visual representations (images) do not depict negation, for example, the negation expressed by the sentence "the train is not coming." The present study focuses on real-world visual representations, namely photographs and comic (manga) illustrations, and empirically investigates whether humans and machines, that is, modern deep neural networks, can recognize visual representations as expressing negation. By collecting the captions humans gave to images and analyzing the occurrences of negation phrases, we provide some evidence that humans recognize certain images as expressing negation. Building on this finding, we examined whether humans and machines can classify novel images as expressing negation. Humans were able to classify the images correctly to some extent, as expected from the analysis of the image captions. By contrast, the machine learning model for image processing performed this classification only at about chance level, falling short of human performance. Based on these results, we discuss what makes humans capable of recognizing negation in visual representations, highlighting the role of the background commonsense knowledge that humans can exploit. Comparing human and machine learning performance suggests new ways to understand human cognitive abilities and to build artificial intelligence systems with more human-like abilities to understand logical concepts.