Priorelli Matteo, Pezzulo Giovanni, Stoianov Ivilin Peev
Institute of Cognitive Sciences and Technologies, National Research Council of Italy, 35137 Padova, Italy.
Institute of Cognitive Sciences and Technologies, National Research Council of Italy, 00185 Rome, Italy.
Biomimetics (Basel). 2023 Sep 21;8(5):445. doi: 10.3390/biomimetics8050445.
Depth estimation is an ill-posed problem: objects of different shapes or dimensions, placed at different distances, may project the same image onto the retina. The brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. However, it remains unclear how the computations required for depth estimation are implemented in biologically plausible ways. State-of-the-art approaches to depth estimation based on deep neural networks implicitly describe the brain as a hierarchical feature detector. In this paper, we instead propose an alternative approach that casts depth estimation as a problem of active inference. We show that depth can be inferred by inverting a hierarchical generative model that simultaneously predicts the eyes' projections from a 2D belief over an object. Model inversion consists of a series of biologically plausible homogeneous transformations based on Predictive Coding principles. Under the plausible assumption of a nonuniform foveal resolution, depth estimation favors an active vision strategy that fixates the object with the eyes, rendering the depth belief more accurate. This strategy is not realized by first fixating on a target and then estimating the depth; instead, it combines the two processes through action-perception cycles, with a mechanism similar to that of saccades during object recognition. The proposed approach requires only local (top-down and bottom-up) message passing, which can be implemented in biologically plausible neural circuits.
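To make the idea of inferring depth by inverting a generative model concrete, the following is a minimal sketch, not the authors' model. It assumes a top-down 2D scene, two fixed parallel eyes with pinhole projection, and plain gradient descent on the squared prediction error with unit precisions. The hierarchical homogeneous transformations, eye vergence, and nonuniform resolution described in the abstract are omitted, and all names and constants (`BASELINE`, `FOCAL`, `project`, `infer_position`) are hypothetical.

```python
# Illustrative sketch only: a simplified stand-in, not the paper's model.
BASELINE = 0.06  # assumed distance between the two eyes
FOCAL = 1.0      # assumed focal length (arbitrary units)


def project(x, z):
    """Generative model: predict the 1D image coordinate of a point at
    lateral position x and depth z, as seen by the left and right eye."""
    left = FOCAL * (x + BASELINE / 2) / z   # left eye sits at -BASELINE / 2
    right = FOCAL * (x - BASELINE / 2) / z  # right eye sits at +BASELINE / 2
    return left, right


def infer_position(obs, x=0.0, z=1.0, lr=0.1, steps=20000):
    """Invert the model: update the belief (x, z) so that the predicted
    projections match the observed ones."""
    for _ in range(steps):
        pred_l, pred_r = project(x, z)
        # Bottom-up prediction errors, one per eye.
        err_l, err_r = obs[0] - pred_l, obs[1] - pred_r
        # Errors mapped back onto the belief through the model's Jacobian:
        # d(pred)/dx = FOCAL / z and d(pred)/dz = -pred / z.
        dx = FOCAL / z * (err_l + err_r)
        dz = -(pred_l * err_l + pred_r * err_r) / z
        x += lr * dx
        z += lr * dz
    return x, z


if __name__ == "__main__":
    observed = project(0.1, 0.5)  # projections of a hypothetical object
    print(infer_position(observed))
```

In this sketch the lateral position converges within a few iterations, while the depth belief converges much more slowly because it depends only on the small disparity between the two projections. The step size and iteration count are arbitrary choices for this toy setting.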