Hillis James M, Watt Simon J, Landy Michael S, Banks Martin S
Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA.
J Vis. 2004 Dec 1;4(12):967-92. doi: 10.1167/4.12.1.
How does the visual system combine information from different depth cues to estimate three-dimensional scene parameters? We tested a maximum-likelihood estimation (MLE) model of cue combination for perspective (texture) and binocular disparity cues to surface slant. By factoring the reliability of each cue into the combination process, MLE provides more reliable estimates of slant than would be available from either cue alone. We measured the reliability of each cue in isolation across a range of slants and distances using a slant-discrimination task. The reliability of the texture cue increases as |slant| increases and does not change with distance. The reliability of the disparity cue decreases as distance increases and varies with slant in a way that also depends on viewing distance. The trends in the single-cue data can be understood in terms of the information available in the retinal images and issues related to solving the binocular correspondence problem. To test the MLE model, we measured perceived slant of two-cue stimuli when disparity and texture were in conflict and the reliability of slant estimation when both cues were available. Results from the two-cue study indicate, consistent with the MLE model, that observers weight each cue according to its relative reliability: Disparity weight decreased as distance and |slant| increased. We also observed the expected improvement in slant estimation when both cues were available. With few discrepancies, our data indicate that observers combine cues in a statistically optimal fashion and thereby reduce the variance of slant estimates below that which could be achieved from either cue alone. These results are consistent with other studies that quantitatively examined the MLE model of cue combination. Thus, there is a growing empirical consensus that MLE provides a good quantitative account of cue combination and that sensory information is used in a manner that maximizes the precision of perceptual estimates.
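The reliability-weighted combination described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code. It assumes the standard form of the MLE model, in which each cue's noise is independent and Gaussian, so that reliability is the inverse variance of the single-cue estimate. The names `CueEstimate` and `combine_mle` and all numeric values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class CueEstimate:
    slant_deg: float  # single-cue slant estimate, in degrees
    sigma_deg: float  # standard deviation of that estimate, in degrees


def combine_mle(texture: CueEstimate, disparity: CueEstimate):
    """Combine two slant estimates, assuming independent Gaussian noise.

    Returns (combined slant, combined sigma, disparity weight).
    """
    # Reliability is defined here as inverse variance (an assumption of the sketch).
    r_texture = 1.0 / texture.sigma_deg ** 2
    r_disparity = 1.0 / disparity.sigma_deg ** 2

    # Each cue's weight is its share of the total reliability.
    w_disparity = r_disparity / (r_texture + r_disparity)
    w_texture = 1.0 - w_disparity

    slant = w_texture * texture.slant_deg + w_disparity * disparity.slant_deg

    # Reliabilities add, so the combined sigma is never larger than either input.
    sigma = math.sqrt(1.0 / (r_texture + r_disparity))
    return slant, sigma, w_disparity


# Illustrative cue-conflict stimulus (made-up numbers): texture specifies 30 deg,
# disparity specifies 36 deg, and disparity is the more reliable cue.
texture = CueEstimate(slant_deg=30.0, sigma_deg=4.0)
disparity = CueEstimate(slant_deg=36.0, sigma_deg=2.0)
slant, sigma, w_disparity = combine_mle(texture, disparity)
print(f"slant={slant:.2f} deg, sigma={sigma:.2f} deg, w_disparity={w_disparity:.2f}")
```

With these made-up inputs the combined estimate falls closer to the disparity-specified slant, and the combined standard deviation is below that of either cue alone. Those are the two predictions the two-cue experiments test: cue weights that track relative reliability, and improved precision when both cues are available.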