Anchor objects drive realism while diagnostic objects drive categorization in GAN generated scenes.

Author information

Kallmayer Aylin, Võ Melissa L-H

Affiliation

Goethe University Frankfurt, Department of Psychology, Frankfurt am Main, Germany.

Publication information

Commun Psychol. 2024 Jul 26;2(1):68. doi: 10.1038/s44271-024-00119-z.

DOI: 10.1038/s44271-024-00119-z

PMID: 39242968

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11332195/

Abstract

Our visual surroundings are highly complex. Despite this, we understand and navigate them effortlessly. This requires transforming incoming sensory information into representations that not only span low- to high-level visual features (e.g., edges, object parts, objects), but likely also reflect co-occurrence statistics of objects in real-world scenes. Here, so-called anchor objects are defined as being highly predictive of the location and identity of frequently co-occurring (usually smaller) objects, derived from object clustering statistics in real-world scenes, while so-called diagnostic objects are predictive of the larger semantic context (i.e., scene category). Across two studies (N = 50, N = 44), we investigate which of these properties underlie scene understanding across two dimensions - realism and categorisation - using scenes generated from Generative Adversarial Networks (GANs) which naturally vary along these dimensions. We show that anchor objects and mainly high-level features extracted from a range of pre-trained deep neural networks (DNNs) drove realism both at first glance and after initial processing. Categorisation performance was mainly determined by diagnostic objects, regardless of realism, at first glance and after initial processing. Our results are testament to the visual system's ability to pick up on reliable, category-specific sources of information that are flexible towards disturbances across the visual feature-hierarchy.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95fc/11332195/43a45b3fbbd0/44271_2024_119_Fig1_HTML.jpg

Similar articles

Predictive processing of scenes and objects.

Nat Rev Psychol. 2024 Jan;3:13-26. doi: 10.1038/s44159-023-00254-0. Epub 2023 Nov 23.

You shall know an object by the company it keeps: An investigation of semantic representations derived from object co-occurrence in visual scenes.

Neuropsychologia. 2015 Sep;76:52-61. doi: 10.1016/j.neuropsychologia.2014.08.031. Epub 2014 Sep 6.

Deep Convolutional Neural Networks Outperform Feature-Based But Not Categorical Models in Explaining Object Similarity Judgments.

Front Psychol. 2017 Oct 9;8:1726. doi: 10.3389/fpsyg.2017.01726. eCollection 2017.

The Neural Dynamics of Attentional Selection in Natural Scenes.

J Neurosci. 2016 Oct 12;36(41):10522-10528. doi: 10.1523/JNEUROSCI.1385-16.2016.

Do simultaneously viewed objects influence scene recognition individually or as groups? Two perceptual studies.

PLoS One. 2014 Aug 13;9(8):e102819. doi: 10.1371/journal.pone.0102819. eCollection 2014.

Scene-selective brain regions respond to embedded objects of a scene.

Cereb Cortex. 2023 Apr 25;33(9):5066-5074. doi: 10.1093/cercor/bhac399.

On the Necessity of Recurrent Processing during Object Recognition: It Depends on the Need for Scene Segmentation.

J Neurosci. 2021 Jul 21;41(29):6281-6289. doi: 10.1523/JNEUROSCI.2851-20.2021.

Hierarchical organization of objects in scenes is reflected in mental representations of objects.

Sci Rep. 2022 Nov 23;12(1):20068. doi: 10.1038/s41598-022-24505-x.

A hierarchical probabilistic model for rapid object categorization in natural scenes.

PLoS One. 2011;6(5):e20002. doi: 10.1371/journal.pone.0020002. Epub 2011 May 25.

References cited by this article

The neuroconnectionist research programme.

Nat Rev Neurosci. 2023 Jul;24(7):431-450. doi: 10.1038/s41583-023-00705-w. Epub 2023 May 30.

Disentangling diagnostic object properties for human scene categorization.

Sci Rep. 2023 Apr 11;13(1):5912. doi: 10.1038/s41598-023-32385-y.

Deep Neural Networks and Visuo-Semantic Models Explain Complementary Components of Human Ventral-Stream Representational Dynamics.

J Neurosci. 2023 Mar 8;43(10):1731-1741. doi: 10.1523/JNEUROSCI.1424-22.2022. Epub 2023 Feb 9.

Deep problems with neural network models of human vision.

Behav Brain Sci. 2022 Dec 1;46:e385. doi: 10.1017/S0140525X22002813.

Hierarchical organization of objects in scenes is reflected in mental representations of objects.

Sci Rep. 2022 Nov 23;12(1):20068. doi: 10.1038/s41598-022-24505-x.

Measuring memory is harder than you think: How to avoid problematic measurement practices in memory research.

Psychon Bull Rev. 2023 Apr;30(2):421-449. doi: 10.3758/s13423-022-02179-w. Epub 2022 Oct 19.

Auxiliary Scene-Context Information Provided by Anchor Objects Guides Attention and Locomotion in Natural Search Behavior.

Psychol Sci. 2022 Sep;33(9):1463-1476. doi: 10.1177/09567976221091838. Epub 2022 Aug 9.

What makes a scene? Fast scene categorization as a function of global scene information at different resolutions.

J Exp Psychol Hum Percept Perform. 2022 Aug;48(8):871-888. doi: 10.1037/xhp0001020. Epub 2022 Jun 16.

A self-supervised domain-general learning framework for human ventral stream representation.

Nat Commun. 2022 Jan 25;13(1):491. doi: 10.1038/s41467-022-28091-4.

The forest, the trees, or both? Hierarchy and interactions between gist and object processing during perception of real-world scenes.

Cognition. 2022 Apr;221:104983. doi: 10.1016/j.cognition.2021.104983. Epub 2021 Dec 27.
