Active Vision in Binocular Depth Estimation: A Top-Down Perspective.

Authors

Priorelli Matteo, Pezzulo Giovanni, Stoianov Ivilin Peev

Affiliations

Institute of Cognitive Sciences and Technologies, National Research Council of Italy, 35137 Padova, Italy.

Institute of Cognitive Sciences and Technologies, National Research Council of Italy, 00185 Rome, Italy.

Publication information

Biomimetics (Basel). 2023 Sep 21;8(5):445. doi: 10.3390/biomimetics8050445.

DOI: 10.3390/biomimetics8050445
PMID: 37754196
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10526497/

Abstract

Depth estimation is an ill-posed problem; objects of different shapes or dimensions, even if at different distances, may project to the same image on the retina. Our brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. However, it remains unclear how the computations required for depth estimation are implemented in biologically plausible ways. State-of-the-art approaches to depth estimation based on deep neural networks implicitly describe the brain as a hierarchical feature detector. Instead, in this paper we propose an alternative approach that casts depth estimation as a problem of active inference. We show that depth can be inferred by inverting a hierarchical generative model that simultaneously predicts the eyes' projections from a 2D belief over an object. Model inversion consists of a series of biologically plausible homogeneous transformations based on Predictive Coding principles. Under the plausible assumption of a nonuniform fovea resolution, depth estimation favors an active vision strategy that fixates the object with the eyes, rendering the depth belief more accurate. This strategy is not realized by first fixating on a target and then estimating the depth; instead, it combines the two processes through action-perception cycles, through a mechanism similar to that of saccades during object recognition. The proposed approach requires only local (top-down and bottom-up) message passing, which can be implemented in biologically plausible neural circuits.
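
The idea of inferring depth by inverting a generative model that predicts both eyes' projections can be illustrated with a deliberately simplified sketch. This is not the authors' model: it assumes a top-down 2D scene, two parallel pinhole eyes with no vergence, no eye movements, uniform resolution, and noiseless observations. The function names, the baseline of 1.0 (distances are expressed in interocular-baseline units), the starting belief, the learning rate, and the target position are all illustrative assumptions. The belief over the object's position `(x, z)` is updated by gradient descent on the summed squared prediction errors of the two eyes, which is the basic error-minimizing scheme that Predictive Coding builds on.

```python
def project(x, z, eye_x, focal=1.0):
    """Pinhole projection of a top-down point (x, z) onto an eye at (eye_x, 0)
    looking along +z: translate into the eye frame, then divide by depth."""
    return focal * (x - eye_x) / z


def infer_position(obs_left, obs_right, baseline=1.0, focal=1.0,
                   init=(0.0, 1.5), lr=0.2, steps=5000):
    """Gradient descent on the summed squared prediction errors of both eyes.

    All default values are illustrative assumptions, not taken from the paper.
    """
    x, z = init
    eyes = (-baseline / 2.0, baseline / 2.0)
    observations = (obs_left, obs_right)
    for _ in range(steps):
        grad_x = 0.0
        grad_z = 0.0
        for eye_x, obs in zip(eyes, observations):
            # Bottom-up signal: observed projection minus top-down prediction.
            error = obs - project(x, z, eye_x, focal)
            # Partial derivatives of the prediction w.r.t. the belief (x, z).
            dpred_dx = focal / z
            dpred_dz = -focal * (x - eye_x) / (z * z)
            grad_x += error * dpred_dx
            grad_z += error * dpred_dz
        # Move the belief in the direction that reduces prediction error.
        x += lr * grad_x
        z += lr * grad_z
    return x, z


# Hypothetical target at lateral offset 0.3 and depth 2.0 (baseline units).
true_x, true_z = 0.3, 2.0
u_left = project(true_x, true_z, -0.5)
u_right = project(true_x, true_z, 0.5)
est_x, est_z = infer_position(u_left, u_right)
```

In this toy setting the depth could also be read off in closed form as `baseline * focal / (u_left - u_right)`; the iterative version is shown only because it uses nothing but the local prediction errors of each eye, in the spirit of the message passing the abstract describes. The depth component converges much more slowly than the lateral component, since a change in depth alters the two projections far less than a change in lateral position does.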
