School of Optometry and the Helen Wills Neuroscience Institute, University of California, Berkeley, Berkeley, CA 94720-2020, United States.
Vision Res. 2021 Mar;180:11-36. doi: 10.1016/j.visres.2020.11.009. Epub 2020 Dec 21.
We describe a new unified model that explains both binocular fusion and depth perception over a broad range of depths. At each location, the model consists of an array of paired spatial-frequency filters with different relative horizontal shifts (position disparity) and interocular phase disparities of 0°, 90°, ±180°, or −90°. The paired filters with different spatial profiles (non-zero phase disparity) compute interocular misalignment and provide phase-disparity energy (binocular fusion energy) that drives selection of the appropriate filters along the position-disparity space until the misalignment is eliminated and sensory fusion is achieved locally. The paired filters with identical spatial profiles (zero phase disparity) compute the position-disparity energy. After sensory fusion, the combination of position-disparity energy and any residual phase-disparity energy is computed for binocular depth perception. Binocular fusion occurs at multiple scales following a coarse-to-fine process. At a given location, the apparent depth is the weighted sum, across all spatial-frequency channels, of fusion shifts combined with residual phase disparity, where the weights depend on stimulus spatial frequency and stimulus contrast. To test the theory, we measured minimum and maximum disparity thresholds (Dmin and Dmax) at three spatial frequencies and with different interocular contrast levels. The stimuli were Random-Gabor-Patch (RGP) stereograms consisting of Gabor patches with random positions and phases but a fixed spatial frequency. The two eyes viewed identical arrays of patches, except that one eye's array could be shifted horizontally and could differ in contrast. Our experiments and modeling reveal two contrast-normalization mechanisms: (1) Energy Normalization (EN): binocular energy is normalized by monocular energy after the site of binocular combination.
This predicts constant Dmin thresholds when stimulus contrast is varied in the two eyes. (2) DSKL-model interocular interactions: monocular contrasts are normalized before the binocular combination site through interocular contrast gain-control and gain-enhancement mechanisms. This predicts contrast-dependent Dmax thresholds. We tested a range of models and found that a model consisting of a second-order pathway with DSKL interocular interactions and a first-order pathway with EN in each spatial-frequency band accounts for both the Dmin and Dmax data very well. Simulations show that the model makes reasonable predictions of suprathreshold depth perception.
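As an illustrative sketch only (not the authors' implementation, and with all parameter values chosen for demonstration), the phase-disparity energy computation described above can be expressed with paired 1-D Gabor filters whose carriers differ by an interocular phase disparity, combined linearly and then squared over a quadrature pair:

```python
import numpy as np

def gabor(x, sf, phase, sigma=1.0):
    """1-D Gabor filter: Gaussian envelope times a cosine carrier."""
    return np.exp(-x**2 / (2 * sigma**2)) * np.cos(2 * np.pi * sf * x + phase)

def disparity_energy(left_img, right_img, x, sf, phase_disparity, sigma=1.0):
    """Binocular energy for one left/right filter pair whose carriers
    differ by `phase_disparity`, summed over a quadrature pair."""
    energy = 0.0
    for base_phase in (0.0, np.pi / 2):                      # quadrature pair
        f_left = gabor(x, sf, base_phase, sigma)
        f_right = gabor(x, sf, base_phase + phase_disparity, sigma)
        binocular = f_left @ left_img + f_right @ right_img  # linear binocular sum
        energy += binocular ** 2                             # squaring nonlinearity
    return energy

# A grating phase-shifted between the eyes: the energy is largest for the
# filter pair whose interocular phase disparity matches the stimulus shift.
x = np.linspace(-3, 3, 601)
sf = 1.0
left = np.cos(2 * np.pi * sf * x)
right = np.cos(2 * np.pi * sf * x + np.pi / 2)   # quarter-cycle interocular shift
e_matched = disparity_energy(left, right, x, sf, np.pi / 2)
e_opposite = disparity_energy(left, right, x, sf, -np.pi / 2)
```

In the model, this energy peak is what drives selection along the position-disparity space until the interocular misalignment is eliminated; the sketch only shows the single-channel energy computation.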
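The two contrast-normalization mechanisms can be caricatured as follows; the functional forms below are illustrative assumptions, not the paper's actual EN or DSKL equations, which are more elaborate (the DSKL model includes both gain control and gain enhancement):

```python
def energy_normalization(resp_left, resp_right, eps=1e-12):
    """EN (illustrative form): binocular energy divided by the summed
    monocular energies, applied AFTER the binocular combination site."""
    binocular_energy = (resp_left + resp_right) ** 2
    monocular_energy = resp_left ** 2 + resp_right ** 2 + eps
    return binocular_energy / monocular_energy

def interocular_gain_control(c_left, c_right):
    """Loose sketch of DSKL-style normalization BEFORE combination:
    each eye's contrast is attenuated by the other eye's contrast.
    Only the mutual-suppression idea is captured here."""
    g_left = c_left / (1.0 + c_right)
    g_right = c_right / (1.0 + c_left)
    return g_left + g_right

# EN is invariant to scaling both eyes' responses by a common factor,
# consistent with Dmin thresholds that do not depend on overall contrast:
en_low = energy_normalization(0.3, 0.6)
en_high = energy_normalization(3.0, 6.0)   # same interocular ratio, 10x contrast

# The pre-combination gain control is not contrast invariant,
# consistent with contrast-dependent Dmax thresholds:
gc_low = interocular_gain_control(0.3, 0.3)
gc_high = interocular_gain_control(3.0, 3.0)
```

The contrast-invariance of the EN stage versus the contrast-dependence of the pre-combination stage mirrors the paper's account of why Dmin is constant across contrast while Dmax varies with it.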