
Active Fovea-Based Vision Through Computationally-Effective Model-Based Prediction.

Author Information

Daucé Emmanuel

Affiliation

Ecole Centrale de Marseille, INSERM, Institut de Neurosciences des Systèmes, Aix Marseille Université, Marseille, France.

Publication Information

Front Neurorobot. 2018 Dec 14;12:76. doi: 10.3389/fnbot.2018.00076. eCollection 2018.

Abstract

What motivates an action in the absence of a definite reward? Taking the case of visuomotor control, we consider a minimal control problem, namely how to select the next saccade, in a sequence of discrete eye movements, when the final objective is to better interpret the current visual scene. The visual scene is modeled here as a partially-observed environment, with a generative model explaining how the visual data are shaped by action. This makes it possible to interpret different action selection metrics proposed in the literature, including the Salience, the Infomax, and the Variational Free Energy, under a single information-theoretic construct, namely the view-based Information Gain. Pursuing this analytic track, two original action selection metrics, named the Information Gain Lower Bound (IGLB) and the Information Gain Upper Bound (IGUB), are then proposed. Showing respectively a conservative and an optimistic bias with respect to the Information Gain, they greatly simplify its calculation. An original fovea-based visual scene decoding setup is then proposed, with numerical experiments highlighting different facets of artificial fovea-based vision. A first and principal result is that state-of-the-art recognition rates are obtained with fovea-based saccadic exploration using less than 10% of the original image's data. These satisfactory results illustrate the advantage of combining predictive control with an accurate state-of-the-art predictor, namely a deep neural network. A second result is the sub-optimality of some classical action-selection metrics widely used in the literature, which is not manifest with finely-tuned inference models but becomes patent when coarse or faulty models are used. Last, a computationally-effective predictive model is developed using the IGLB objective, with pre-processed visual scan-paths read out from memory, bypassing computationally-demanding predictive calculations. This last simplified setting proves effective in our case, showing both competitive accuracy and good robustness to model flaws.
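As a hedged illustration of the view-based Information Gain construct (a standard formalization of expected information gain; the paper's exact notation and the IGLB/IGUB derivations may differ), the gain expected from a candidate saccade $a$ can be written as the anticipated reduction in posterior entropy over the scene interpretation $s$, given the views $x_{1:t}$ gathered so far:

$$\mathrm{IG}(a) \;=\; H\big(p(s \mid x_{1:t})\big) \;-\; \mathbb{E}_{x_{t+1} \sim p(x_{t+1} \mid x_{1:t},\, a)}\Big[ H\big(p(s \mid x_{1:t}, x_{t+1})\big) \Big].$$

The expectation over not-yet-observed views is what makes this quantity costly to evaluate; the IGLB and IGUB proposed in the paper bound it from below and above, trading exactness for tractability.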

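To make the selection loop concrete, here is a minimal, self-contained sketch of greedy information-gain saccade selection over a toy discrete generative model. The dimensions, the tabular likelihood, and all names are illustrative assumptions for this sketch only; the paper uses a deep neural network predictor over foveated image patches, not a lookup table.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (not the paper's model): a few scene categories,
# candidate saccade targets, and discretized foveal views.
N_CLASSES, N_TARGETS, N_VIEWS = 4, 5, 8

# Generative model as a table p(view | category, target), row-normalized.
likelihood = rng.random((N_CLASSES, N_TARGETS, N_VIEWS))
likelihood /= likelihood.sum(axis=-1, keepdims=True)

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a categorical distribution."""
    p = np.clip(p, eps, 1.0)
    return float(-np.sum(p * np.log(p)))

def expected_information_gain(posterior, target):
    """Expected entropy drop over categories if the next saccade lands
    on `target`, marginalizing over the views the generative model
    predicts under the current posterior."""
    h_now = entropy(posterior)
    # Predictive view distribution: p(v) = sum_s p(s) * p(v | s, target).
    predictive = posterior @ likelihood[:, target, :]
    expected_h = 0.0
    for v, p_v in enumerate(predictive):
        # Simulated Bayesian update for each possible view outcome.
        updated = posterior * likelihood[:, target, v]
        updated /= updated.sum()
        expected_h += p_v * entropy(updated)
    return h_now - expected_h

# Greedy selection: saccade to the target with the largest expected gain.
posterior = np.full(N_CLASSES, 1.0 / N_CLASSES)
gains = [expected_information_gain(posterior, a) for a in range(N_TARGETS)]
best = int(np.argmax(gains))
print(f"next saccade -> target {best}, expected gain {gains[best]:.3f} nats")
```

The inner loop over simulated view outcomes is exactly the expectation that the paper's IGLB and IGUB are designed to avoid: a bound computable from quantities already at hand replaces the full marginalization over future observations.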

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f950/6302111/56552a5f5f4e/fnbot-12-00076-g0001.jpg
