From simple innate biases to complex visual concepts.

Affiliation

Department of Mathematics and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel.

Publication information

Proc Natl Acad Sci U S A. 2012 Oct 30;109(44):18215-20. doi: 10.1073/pnas.1207690109. Epub 2012 Sep 24.

DOI: 10.1073/pnas.1207690109

PMID: 23012418

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3497814/

Abstract

Early in development, infants learn to solve visual problems that are highly challenging for current computational methods. We present a model that deals with two fundamental problems in which the gap between computational difficulty and infant learning is particularly striking: learning to recognize hands and learning to recognize gaze direction. The model is shown a stream of natural videos and learns without any supervision to detect human hands by appearance and by context, as well as direction of gaze, in complex natural scenes. The algorithm is guided by an empirically motivated innate mechanism--the detection of "mover" events in dynamic images, which are the events of a moving image region causing a stationary region to move or change after contact. Mover events provide an internal teaching signal, which is shown to be more effective than alternative cues and sufficient for the efficient acquisition of hand and gaze representations. The implications go beyond the specific tasks, by showing how domain-specific "proto concepts" can guide the system to acquire meaningful concepts, which are significant to the observer but statistically inconspicuous in the sensory input.
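
The "mover" cue described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes simple frame differencing on a grid of cells in place of real motion estimation, and the function names, cell size, and thresholds are all hypothetical. A cell is flagged when motion arrives from a neighboring cell, later continues into a neighboring cell, and the cell's content differs before and after.

```python
import numpy as np

def cell_abs_diff(a, b, cell):
    """Mean absolute difference of two frames, pooled over cell x cell blocks."""
    rows, cols = a.shape[0] // cell, a.shape[1] // cell
    d = np.abs(a[:rows * cell, :cols * cell] - b[:rows * cell, :cols * cell])
    return d.reshape(rows, cell, cols, cell).mean(axis=(1, 3))

def neighbor_active(mask, r, c):
    """True if any of the 8 neighbors of cell (r, c) is set in a 2D boolean mask."""
    r0, r1 = max(r - 1, 0), min(r + 2, mask.shape[0])
    c0, c1 = max(c - 1, 0), min(c + 2, mask.shape[1])
    window = mask[r0:r1, c0:c1].copy()
    window[r - r0, c - c0] = False  # ignore the cell itself
    return bool(window.any())

def detect_mover_events(frames, cell=4, motion_thresh=10.0, change_thresh=10.0):
    """Return (frame_before, frame_after, row, col) for each candidate mover event.

    Assumption: frame differencing stands in for motion estimation, and
    thresholds are illustrative values chosen for the synthetic clip below.
    """
    frames = np.asarray(frames, dtype=float)
    n = len(frames)
    if n < 2:
        return []
    # moving[t] marks cells that changed between frame t and frame t + 1.
    moving = np.array([cell_abs_diff(frames[t], frames[t + 1], cell) > motion_thresh
                       for t in range(n - 1)])
    _, rows, cols = moving.shape
    events = []
    for r in range(rows):
        for c in range(cols):
            start = None
            for t in range(n - 1):
                if moving[t, r, c]:
                    if start is None:
                        start = t  # frame `start` is the last still view of the cell
                    continue
                if start is not None:
                    # The cell is still again; frame t is its first view after the motion.
                    incoming = start >= 1 and neighbor_active(moving[start - 1], r, c)
                    outgoing = neighbor_active(moving[t], r, c)
                    changed = cell_abs_diff(frames[start], frames[t], cell)[r, c] > change_thresh
                    if incoming and outgoing and changed:
                        events.append((start, t, r, c))
                    start = None
    return events

def make_clip(with_object):
    """Synthetic clip: a bright 'hand' sweeps left to right across a 4 x 12 image."""
    hand_x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10]
    clip = np.zeros((len(hand_x), 4, 12))
    for t, x in enumerate(hand_x):
        if with_object and x < 5:
            clip[t, :, 5:7] = 128.0  # object rests in the middle cell until contact
        clip[t, :, x:x + 2] = 255.0  # hand drawn on top; after contact it carries the object off
    return clip

print(detect_mover_events(make_clip(True)))
print(detect_mover_events(make_clip(False)))
```

On this synthetic clip the middle cell is reported only when the hand removes an object from it; a hand that merely passes through an empty cell leaves the cell unchanged and produces no event. Events of this kind are what the abstract describes as an internal teaching signal for learning hand appearance.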

Similar articles

A model for discovering 'containment' relations.

Cognition. 2019 Feb;183:67-81. doi: 10.1016/j.cognition.2018.11.001. Epub 2018 Nov 9.

Domain Specificity of Oculomotor Learning after Changes in Sensory Processing.

J Neurosci. 2017 Nov 22;37(47):11469-11484. doi: 10.1523/JNEUROSCI.1208-17.2017. Epub 2017 Oct 20.

Watch the hands: infants can learn to follow gaze by seeing adults manipulate objects.

Dev Sci. 2014 Mar;17(2):270-81. doi: 10.1111/desc.12122. Epub 2014 Jan 4.

Tactile localization biases are modulated by gaze direction.

Exp Brain Res. 2018 Jan;236(1):31-42. doi: 10.1007/s00221-017-5105-2. Epub 2017 Oct 10.

Performance of a Computational Model of the Mammalian Olfactory System

Viewing Complex, Dynamic Scenes "Through the Eyes" of Another Person: The Gaze-Replay Paradigm.

PLoS One. 2015 Aug 7;10(8):e0134347. doi: 10.1371/journal.pone.0134347. eCollection 2015.

How does image noise affect actual and predicted human gaze allocation in assessing image quality?

Vision Res. 2015 Jul;112:11-25. doi: 10.1016/j.visres.2015.03.029. Epub 2015 May 14.

Neural representational geometry underlies few-shot concept learning.

Proc Natl Acad Sci U S A. 2022 Oct 25;119(43):e2200800119. doi: 10.1073/pnas.2200800119. Epub 2022 Oct 17.

Task-based model/human observer evaluation of SPIHT wavelet compression with human visual system-based quantization.

Acad Radiol. 2005 Mar;12(3):324-36. doi: 10.1016/j.acra.2004.09.015.

Cited by

Causal Perception(s).

Cogn Sci. 2025 Sep;49(9):e70107. doi: 10.1111/cogs.70107.

Expert-level understanding of social scenes requires early visual experience.

iScience. 2025 Apr 16;28(5):112454. doi: 10.1016/j.isci.2025.112454. eCollection 2025 May 16.

Relational visual representations underlie human social interaction recognition.

Nat Commun. 2023 Nov 11;14(1):7317. doi: 10.1038/s41467-023-43156-8.

Seeing social interactions.

Trends Cogn Sci. 2023 Dec;27(12):1165-1179. doi: 10.1016/j.tics.2023.09.001. Epub 2023 Oct 5.

Novel objects with causal event schemas elicit selective responses in tool- and hand-selective lateral occipitotemporal cortex.

Cereb Cortex. 2023 Apr 25;33(9):5557-5573. doi: 10.1093/cercor/bhac442.

Head turning is an effective cue for gaze following: Evidence from newly sighted individuals, school children and adults.

Neuropsychologia. 2022 Sep 9;174:108330. doi: 10.1016/j.neuropsychologia.2022.108330. Epub 2022 Jul 14.

Face identity coding in the deep neural network and primate brain.

Commun Biol. 2022 Jun 20;5(1):611. doi: 10.1038/s42003-022-03557-9.

A Schema-Based Robot Controller Complying With the Constraints of Biological Systems.

Front Neurorobot. 2022 May 9;16:836767. doi: 10.3389/fnbot.2022.836767. eCollection 2022.

Gaze following requires early visual experience.

Proc Natl Acad Sci U S A. 2022 May 17;119(20):e2117184119. doi: 10.1073/pnas.2117184119. Epub 2022 May 12.

Face detection in untrained deep neural networks.

Nat Commun. 2021 Dec 16;12(1):7328. doi: 10.1038/s41467-021-27606-9.

References

Measuring the Development of Social Attention Using Free-Viewing.

Infancy. 2012 Jul;17(4):355-375. doi: 10.1111/j.1532-7078.2011.00086.x. Epub 2011 Jul 28.

Neural theory for the perception of causal actions.

Psychol Res. 2012 Jul;76(4):476-93. doi: 10.1007/s00426-012-0437-9. Epub 2012 Apr 26.

Measuring the objectness of image windows.

IEEE Trans Pattern Anal Mach Intell. 2012 Nov;34(11):2189-202. doi: 10.1109/TPAMI.2012.28.

How to grow a mind: statistics, structure, and abstraction.

Science. 2011 Mar 11;331(6022):1279-85. doi: 10.1126/science.1192788.

Do young infants respond socially to human hands?

Infant Behav Dev. 2011 Apr;34(2):374-7. doi: 10.1016/j.infbeh.2011.01.004. Epub 2011 Feb 12.

View-based encoding of actions in mirror neurons of area f5 in macaque premotor cortex.

Curr Biol. 2011 Jan 25;21(2):144-8. doi: 10.1016/j.cub.2010.12.022. Epub 2011 Jan 13.

When do infants expect hands to be connected to a person?

J Exp Child Psychol. 2011 Jan;108(1):220-7. doi: 10.1016/j.jecp.2010.08.005. Epub 2010 Sep 22.

Letting structure emerge: connectionist and dynamical systems approaches to cognition.

Trends Cogn Sci. 2010 Aug;14(8):348-56. doi: 10.1016/j.tics.2010.06.002. Epub 2010 Jul 2.

What's in View for Toddlers? Using a Head Camera to Study Visual Experience.

Infancy. 2008 May;13(3):229-248. doi: 10.1080/15250000802004437.

Foundations for a new science of learning.

Science. 2009 Jul 17;325(5938):284-8. doi: 10.1126/science.1175626.
