Suppr 超能文献




Active Fovea-Based Vision Through Computationally-Effective Model-Based Prediction.

Authors

Daucé Emmanuel

Affiliations

Ecole Centrale de Marseille, INSERM, Institut de Neurosciences des Systèmes, Aix Marseille Université, Marseille, France.

Publication

Front Neurorobot. 2018 Dec 14;12:76. doi: 10.3389/fnbot.2018.00076. eCollection 2018.

DOI: 10.3389/fnbot.2018.00076
PMID: 30618705
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6302111/
Abstract

What motivates an action in the absence of a definite reward? Taking the case of visuomotor control, we consider a minimal control problem: how to select the next saccade, in a sequence of discrete eye movements, when the final objective is to better interpret the current visual scene. The visual scene is modeled here as a partially-observed environment, with a generative model explaining how the visual data are shaped by action. This allows us to interpret different action-selection metrics proposed in the literature, including Salience, Infomax, and Variational Free Energy, under a single information-theoretic construct, namely the view-based Information Gain. Pursuing this analytic track, two original action-selection metrics, the Information Gain Lower Bound (IGLB) and the Information Gain Upper Bound (IGUB), are then proposed. Exhibiting a conservative and an optimistic bias, respectively, with respect to the Information Gain, they greatly simplify its calculation. An original fovea-based visual scene decoding setup is then proposed, with numerical experiments highlighting different facets of artificial fovea-based vision. A first and principal result is that state-of-the-art recognition rates are obtained with fovea-based saccadic exploration using less than 10% of the original image's data. These satisfactory results illustrate the advantage of combining predictive control with accurate state-of-the-art predictors, here a deep neural network. A second result is the sub-optimality of some classical action-selection metrics widely used in the literature; this sub-optimality is not manifest with finely tuned inference models, but becomes patent when coarse or faulty models are used. Last, a computationally-effective predictive model is developed using the IGLB objective, with pre-processed visual scan-paths read out from memory, bypassing computationally-demanding predictive calculations. This last simplified setting is shown to be effective in our case, with both competitive accuracy and good robustness to model flaws.
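The selection principle the abstract describes — pick the next fixation that maximizes the expected view-based Information Gain, i.e., the expected reduction in entropy over scene categories — can be sketched on a toy discrete generative model. The categories, candidate views, and likelihood numbers below are invented for illustration and are not from the paper:

```python
import math

def entropy(p):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Toy generative model: 3 scene categories, 2 candidate fixation points.
# lik[v][o][c] = p(o | c, v): probability of observing symbol o when
# fixating view v while the true scene category is c (hypothetical numbers).
prior = [1 / 3, 1 / 3, 1 / 3]
lik = {
    "left":  {"dark": [0.9, 0.1, 0.5], "bright": [0.1, 0.9, 0.5]},
    "right": {"dark": [0.5, 0.5, 0.5], "bright": [0.5, 0.5, 0.5]},  # uninformative
}

def expected_information_gain(prior, lik_v):
    """IG(v) = H(prior) - E_o[ H(posterior given o at view v) ]."""
    eig = entropy(prior)
    for obs_lik in lik_v.values():
        joint = [l * p for l, p in zip(obs_lik, prior)]   # p(o, c | v)
        p_o = sum(joint)                                  # p(o | v)
        if p_o > 0:
            posterior = [j / p_o for j in joint]          # Bayes rule
            eig -= p_o * entropy(posterior)
    return eig

# Greedy saccade selection: fixate the view with the largest expected gain.
best = max(lik, key=lambda v: expected_information_gain(prior, lik[v]))
print(best)  # prints "left": the discriminative view wins
```

The uninformative "right" view leaves the posterior equal to the prior, so its expected gain is zero; the IGLB and IGUB of the paper replace the exact expectation above with cheaper bounds on the same quantity.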


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f950/6302111/56552a5f5f4e/fnbot-12-00076-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f950/6302111/e2ff0fc7014d/fnbot-12-00076-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f950/6302111/e279256cd4b7/fnbot-12-00076-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f950/6302111/057ce76f2176/fnbot-12-00076-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f950/6302111/cd15c5dc21d4/fnbot-12-00076-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f950/6302111/b3d5284f12f9/fnbot-12-00076-g0006.jpg

Similar Articles

1. Active Fovea-Based Vision Through Computationally-Effective Model-Based Prediction. Front Neurorobot. 2018 Dec 14;12:76. doi: 10.3389/fnbot.2018.00076. eCollection 2018.
2. A Generative Model of Cognitive State from Task and Eye Movements. Cognit Comput. 2018 Oct;10(5):703-717. doi: 10.1007/s12559-018-9558-9. Epub 2018 May 9.
3. Scene Construction, Visual Foraging, and Active Inference. Front Comput Neurosci. 2016 Jun 14;10:56. doi: 10.3389/fncom.2016.00056. eCollection 2016.
4. Foveal Vision for Humanoid Robots.
5. Visual signals contribute to the coding of gaze direction. Exp Brain Res. 2002 Jun;144(3):281-92. doi: 10.1007/s00221-002-1029-5. Epub 2002 Apr 13.
6. Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning. IEEE Trans Neural Netw Learn Syst. 2023 Aug;34(8):4776-4790. doi: 10.1109/TNNLS.2021.3129160. Epub 2023 Aug 4.
7. Motor intention activity in the macaque's lateral intraparietal area. I. Dissociation of motor plan from sensory memory. J Neurophysiol. 1996 Sep;76(3):1439-56. doi: 10.1152/jn.1996.76.3.1439.
8. Post-Saccadic Face Processing Is Modulated by Pre-Saccadic Preview: Evidence from Fixation-Related Potentials. J Neurosci. 2020 Mar 11;40(11):2305-2313. doi: 10.1523/JNEUROSCI.0861-19.2020. Epub 2020 Jan 30.
9. Central and peripheral vision for scene recognition: A neurocomputational modeling exploration. J Vis. 2017 Apr 1;17(4):9. doi: 10.1167/17.4.9.
10. Saccadic gain modification: visual error drives motor adaptation. J Neurophysiol. 1998 Nov;80(5):2405-16. doi: 10.1152/jn.1998.80.5.2405.

Cited By

1. A dual foveal-peripheral visual processing model implements efficient saccade selection. J Vis. 2020 Aug 3;20(8):22. doi: 10.1167/jov.20.8.22.
2. Embodied Object Representation Learning and Recognition. Front Neurorobot. 2022 Apr 14;16:840658. doi: 10.3389/fnbot.2022.840658. eCollection 2022.
3. Active Vision for Robot Manipulators Using the Free Energy Principle. Front Neurorobot. 2021 Mar 5;15:642780. doi: 10.3389/fnbot.2021.642780. eCollection 2021.

References

1. Active Inference: A Process Theory. Neural Comput. 2017 Jan;29(1):1-49. doi: 10.1162/NECO_a_00912. Epub 2016 Nov 21.
2. Active inference and epistemic value. Cogn Neurosci. 2015;6(4):187-214. doi: 10.1080/17588928.2015.1020053. Epub 2015 Mar 13.
3. Perceptions as hypotheses: saccades as experiments. Front Psychol. 2012 May 28;3:151. doi: 10.3389/fpsyg.2012.00151. eCollection 2012.
4. An Embodied Agent Learning Affordances With Intrinsic Motivations and Solving Extrinsic Tasks With Attention and One-Step Planning. Front Neurorobot. 2019 Jul 26;13:45. doi: 10.3389/fnbot.2019.00045. eCollection 2019.
5. The free-energy principle: a unified brain theory? Nat Rev Neurosci. 2010 Feb;11(2):127-38. doi: 10.1038/nrn2787. Epub 2010 Jan 13.
6. A novel multiresolution spatiotemporal saliency detection model and its applications in image and video compression. IEEE Trans Image Process. 2010 Jan;19(1):185-98. doi: 10.1109/TIP.2009.2030969.
7. Simple summation rule for optimal fixation selection in visual search. Vision Res. 2009 Jun;49(10):1286-94. doi: 10.1016/j.visres.2008.12.005. Epub 2009 Jan 10.
8. Foveation scalable video coding with automatic fixation selection. IEEE Trans Image Process. 2003;12(2):243-54. doi: 10.1109/TIP.2003.809015.
9. Optimal eye movement strategies in visual search. Nature. 2005 Mar 17;434(7031):387-91. doi: 10.1038/nature03390.
10. The role of execution noise in movement variability. J Neurophysiol. 2004 Feb;91(2):1050-63. doi: 10.1152/jn.00652.2003. Epub 2003 Oct 15.
11. Computational modelling of visual attention. Nat Rev Neurosci. 2001 Mar;2(3):194-203. doi: 10.1038/35058500.