IEEE Trans Image Process. 2019 Mar;28(3):1299-1313. doi: 10.1109/TIP.2018.2878325. Epub 2018 Oct 26.
We present a deep architecture that estimates a stereo confidence, which is essential for improving the accuracy of stereo matching algorithms. In contrast to existing methods based on deep convolutional neural networks (CNNs) that rely on only one of the matching cost volume or the estimated disparity map, our network estimates the stereo confidence by using the two heterogeneous inputs simultaneously. Specifically, the matching probability volume is first computed from the matching cost volume with residual networks and a pooling module in a manner that yields greater robustness. The confidence is then estimated through a unified deep network that combines confidence features extracted from both the matching probability volume and its corresponding disparity. In addition, our method extracts the confidence features of the disparity map by applying multiple convolutional filters of varying sizes to an input disparity map. To learn our networks in a semi-supervised manner, we propose a novel loss function that uses confident points to compute the image reconstruction loss. To validate the effectiveness of our method in a disparity post-processing step, we employ three post-processing approaches: cost modulation, ground control points-based propagation, and aggregated ground control points-based propagation. Experimental results demonstrate that our method outperforms state-of-the-art confidence estimation methods on various benchmarks.
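Two core ideas from the abstract can be illustrated concretely: converting a matching cost volume into a matching probability volume, and evaluating an image reconstruction loss only at confident points. The sketch below is a simplified, hypothetical NumPy stand-in, not the paper's network: a plain softmax over the disparity axis replaces the residual-network-and-pooling module, a hand-supplied confidence map replaces the learned one, and the warping uses rounded integer disparities.

```python
import numpy as np

def cost_to_probability(cost_volume):
    """Turn a matching cost volume of shape (D, H, W) into a matching
    probability volume via a softmax over the disparity axis, so that
    lower cost maps to higher probability. Simplified stand-in for the
    paper's residual-network and pooling module."""
    neg = -cost_volume
    neg = neg - neg.max(axis=0, keepdims=True)   # numerical stability
    e = np.exp(neg)
    return e / e.sum(axis=0, keepdims=True)

def confident_reconstruction_loss(left, right, disparity, confidence, tau=0.8):
    """Semi-supervised reconstruction loss evaluated only at confident
    points: warp the right image to the left view with the disparity map
    and average the absolute error over pixels whose confidence exceeds
    the (hypothetical) threshold tau."""
    H, W = left.shape
    xs = np.tile(np.arange(W), (H, 1))
    # Left pixel (y, x) corresponds to right pixel (y, x - d).
    src = np.clip(xs - np.round(disparity).astype(int), 0, W - 1)
    warped = np.take_along_axis(right, src, axis=1)
    mask = confidence > tau
    if not mask.any():
        return 0.0
    return float(np.abs(left - warped)[mask].mean())
```

As a usage sketch, the disparity map itself can be read off the probability volume as its per-pixel argmax, and the maximum probability serves as a crude confidence proxy; in the paper both are instead produced and fused by learned networks.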