College of Computer Science and Technology, Jilin University, Changchun, China.
Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China.
PLoS One. 2022 Mar 23;17(3):e0264721. doi: 10.1371/journal.pone.0264721. eCollection 2022.
Three-dimensional (3D) image reconstruction is an important field of computer vision that aims to restore the 3D geometry of a given scene. Because prevalent 3D reconstruction methods demand large amounts of memory, they yield inaccurate results, and highly accurate reconstruction of a scene remains an outstanding challenge. This study proposes a cascaded depth residual inference network, called DRI-MVSNet, that uses a cross-view similarity-based feature map fusion module for residual inference. It involves three improvements. First, a combined module that joins the channel attention mechanism with spatial pooling networks is used to process channel-related and spatial information, capturing the relevant contextual information and improving feature representation. Second, a cross-view similarity-based feature map fusion module is proposed that learns the similarity between pairs of pixels in each source image and the reference image at planes of different depths along the frustum of the reference camera. Third, a deep, multi-stage residual prediction module is designed that uses a non-uniform depth sampling strategy to construct hypothetical depth planes and generates a high-precision depth map. The results of extensive experiments show that DRI-MVSNet delivers competitive performance on the DTU and the Tanks & Temples datasets, and that the accuracy and completeness of the point cloud it reconstructs are significantly superior to those of state-of-the-art benchmarks.
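To make the idea of residual inference over non-uniformly sampled depth planes more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the sampling rule, so the power-law spacing, the function names `nonuniform_offsets` and `refine_depth`, and all parameter values are illustrative assumptions. The sketch places hypothetical depth planes more densely near the previous depth estimate and refines that estimate by the probability-weighted residual.

```python
def nonuniform_offsets(num_planes, radius, power=2.0):
    """Return symmetric depth offsets in [-radius, radius], denser near zero.

    ASSUMPTION: the power-law spacing is chosen for illustration only;
    the abstract does not specify the non-uniform sampling rule.
    """
    if num_planes < 2:
        raise ValueError("need at least two depth planes")
    offsets = []
    for i in range(num_planes):
        t = 2.0 * i / (num_planes - 1) - 1.0  # uniform position in [-1, 1]
        sign = 1.0 if t >= 0 else -1.0
        # power > 1 compresses offsets toward zero (finer planes near the estimate)
        offsets.append(sign * abs(t) ** power * radius)
    return offsets


def refine_depth(prev_depth, plane_probs, offsets):
    """Add the expected residual (probability-weighted offset) to prev_depth.

    plane_probs holds one non-negative weight per hypothetical plane,
    for example the softmax output of a cost-volume regularizer (hypothetical here).
    """
    total = sum(plane_probs)
    residual = sum(p * o for p, o in zip(plane_probs, offsets)) / total
    return prev_depth + residual


# Usage for a single pixel: five planes within +/- 8 depth units of the estimate.
offsets = nonuniform_offsets(num_planes=5, radius=8.0)
refined = refine_depth(500.0, [0.0, 0.1, 0.2, 0.6, 0.1], offsets)
print(offsets, refined)
```

In a cascaded setting, each stage would typically shrink `radius` so that later stages search a narrower, finer range around the current estimate; that schedule is likewise an assumption here.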