IEEE Trans Pattern Anal Mach Intell. 2022 Sep;44(9):5314-5334. doi: 10.1109/TPAMI.2021.3070917. Epub 2022 Aug 4.
Stereo matching is one of the longest-standing problems in computer vision, with close to 40 years of study and research. Over the years, the paradigm has shifted from local, pixel-level decisions to various forms of discrete and continuous optimization, and then to data-driven, learning-based methods. Recently, the rise of machine learning and the rapid proliferation of deep learning have enhanced stereo matching with exciting new trends and applications that were unthinkable until a few years ago. Interestingly, the relationship between these two worlds is two-way. While machine learning, and especially deep learning, has advanced the state of the art in stereo matching, stereo itself has enabled new ground-breaking methodologies such as self-supervised monocular depth estimation based on deep networks. In this paper, we review recent research in the field of learning-based depth estimation from single and binocular images, highlighting the synergies, the successes achieved so far, and the open challenges the community is going to face in the immediate future.