Anchor objects drive realism while diagnostic objects drive categorization in GAN generated scenes.

Author information

Kallmayer Aylin, Võ Melissa L-H

Affiliation

Goethe University Frankfurt, Department of Psychology, Frankfurt am Main, Germany.

Publication details

Commun Psychol. 2024 Jul 26;2(1):68. doi: 10.1038/s44271-024-00119-z.

Abstract

Our visual surroundings are highly complex. Despite this, we understand and navigate them effortlessly. This requires transforming incoming sensory information into representations that not only span low- to high-level visual features (e.g., edges, object parts, objects), but likely also reflect co-occurrence statistics of objects in real-world scenes. Here, so-called anchor objects are defined as being highly predictive of the location and identity of frequently co-occurring (usually smaller) objects, derived from object clustering statistics in real-world scenes, while so-called diagnostic objects are predictive of the larger semantic context (i.e., scene category). Across two studies (N = 50, N = 44), we investigate which of these properties underlie scene understanding across two dimensions, realism and categorisation, using scenes generated by Generative Adversarial Networks (GANs), which naturally vary along these dimensions. We show that anchor objects and mainly high-level features extracted from a range of pre-trained deep neural networks (DNNs) drove realism both at first glance and after initial processing. Categorisation performance was mainly determined by diagnostic objects, regardless of realism, at first glance and after initial processing. Our results are testament to the visual system's ability to pick up on reliable, category-specific sources of information that are flexible towards disturbances across the visual feature hierarchy.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95fc/11332195/43a45b3fbbd0/44271_2024_119_Fig1_HTML.jpg
