DRI-MVSNet: A depth residual inference network for multi-view stereo images.

Affiliations

College of Computer Science and Technology, Jilin University, Changchun, China.

Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China.

Publication information

PLoS One. 2022 Mar 23;17(3):e0264721. doi: 10.1371/journal.pone.0264721. eCollection 2022.

DOI:10.1371/journal.pone.0264721

PMID:35320265

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8942269/

Abstract

Three-dimensional (3D) image reconstruction is an important field of computer vision concerned with restoring the 3D geometry of a given scene. Because prevalent 3D reconstruction methods demand large amounts of memory, they yield inaccurate results, and highly accurate reconstruction of a scene remains an outstanding challenge. This study proposes a cascaded depth residual inference network, called DRI-MVSNet, that uses a cross-view similarity-based feature map fusion module for residual inference. It involves three improvements. First, a combined module that joins a channel attention mechanism with spatial pooling networks processes channel-related and spatial information to capture relevant contextual information and improve feature representation. Second, a cross-view similarity-based feature map fusion module is proposed that learns the similarity between pairs of pixels in each source image and the reference image at planes of different depths along the frustum of the reference camera. Third, a deep, multi-stage residual prediction module is designed to generate a high-precision depth map; it uses a non-uniform depth sampling strategy to construct hypothetical depth planes. Extensive experiments show that DRI-MVSNet delivers competitive performance on the DTU and Tanks & Temples datasets, and that the accuracy and completeness of the point clouds it reconstructs are significantly superior to those of state-of-the-art benchmarks.
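
The abstract does not give the sampling formula, so the following is only a minimal sketch of what non-uniform sampling of hypothetical depth planes for residual prediction could look like for a single pixel. The function names, the `power` parameter, and the power-law spacing are all assumptions made for illustration and are not taken from the paper: planes are placed more densely near the previous stage's depth estimate and more sparsely toward the edges of the search range, and the refined depth is the previous estimate plus the probability-weighted residual.

```python
def residual_depth_hypotheses(prev_depth, search_range, num_planes, power=2.0):
    """Return depth hypotheses around prev_depth, denser close to it.

    Hypothetical helper: t is evenly spaced in [-1, 1] and the offset is
    sign(t) * |t| ** power, so with power > 1 the planes cluster near
    prev_depth. This spacing rule is an illustrative assumption.
    """
    if num_planes < 2:
        raise ValueError("num_planes must be at least 2")
    hypotheses = []
    for i in range(num_planes):
        t = -1.0 + 2.0 * i / (num_planes - 1)        # uniform in [-1, 1]
        sign = 1.0 if t >= 0 else -1.0
        offset = sign * abs(t) ** power              # non-uniform in [-1, 1]
        hypotheses.append(prev_depth + offset * search_range)
    return hypotheses


def refine_depth(prev_depth, hypotheses, probabilities):
    """Add the expected residual (under per-plane probabilities) to prev_depth."""
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    residual = sum(
        p * (d - prev_depth) for d, p in zip(hypotheses, probabilities)
    ) / total
    return prev_depth + residual


# Example: five planes within +/-20 units of a coarse estimate of 500.
planes = residual_depth_hypotheses(500.0, 20.0, 5)
refined = refine_depth(500.0, planes, [0.0, 0.0, 0.5, 0.5, 0.0])
```

In a cascaded setting, a later stage would presumably call such a routine again with a smaller `search_range`, which is the usual reason for predicting residuals instead of absolute depths; the per-plane probabilities would come from the network's regularized cost volume, which is not modeled here.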

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e862/8942269/a95f3f3d565c/pone.0264721.g001.jpg
