OD-MVSNet: Omni-dimensional dynamic multi-view stereo network.

Affiliations

College of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan, Shandong, China.

Shandong Zhengyuan Yeda Environmental Technology Co., Ltd., Jinan, Shandong, China.

Publication

PLoS One. 2024 Aug 15;19(8):e0309029. doi: 10.1371/journal.pone.0309029. eCollection 2024.

DOI: 10.1371/journal.pone.0309029
PMID: 39146385
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11326553/
Abstract

Multi-view stereo based on learning is a critical task in three-dimensional reconstruction, enabling the effective inference of depth maps and the reconstruction of fine-grained scene geometry. However, the results obtained by current popular 3D reconstruction methods are not precise, and achieving high-accuracy scene reconstruction remains challenging due to the pervasive impact of feature extraction and the poor correlation within the cost volume. To address these issues, we propose a cascade deep residual inference network to enhance the efficiency and accuracy of multi-view stereo depth estimation. This approach builds a cost-volume pyramid from coarse to fine, generating a lightweight, compact network to improve reconstruction results. Specifically, we introduce the omni-dimensional dynamic atrous spatial pyramid pooling (OSPP), a multiscale feature extraction module capable of generating dense feature maps with multiscale contextual information. The feature maps encoded by the OSPP module can generate dense point clouds without consuming significant memory. Furthermore, to alleviate the issue of feature mismatch in cost volume regularization, we propose a normalization-based 3D attention module. The 3D attention module aggregates crucial information within the cost volume across the dimensions of channel, spatial, and depth. Through extensive experiments on benchmark datasets, notably DTU, we found that the OD-MVSNet model outperforms the baseline model by approximately 1.4% in accuracy loss, 0.9% in completeness loss, and 1.2% in overall loss, demonstrating the effectiveness of our module.
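The OSPP module builds on atrous spatial pyramid pooling (ASPP), where one small kernel samples the input at several dilation rates to capture multiscale context without adding parameters. A minimal NumPy sketch of that sampling idea follows; the averaging kernel, the rates, and the stacking step are illustrative stand-ins, not the paper's actual module:

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """Naive atrous (dilated) 2D convolution with zero padding.

    The k x k kernel samples the input every `rate` pixels, so the
    receptive field grows to (k - 1) * rate + 1 with no extra weights.
    """
    k = kernel.shape[0]
    pad = rate * (k // 2)                # keeps output size == input size
    xp = np.pad(x, pad)
    span = (k - 1) * rate + 1            # extent of the dilated window
    out = np.empty_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            patch = xp[i:i + span:rate, j:j + span:rate]
            out[i, j] = (patch * kernel).sum()
    return out

def aspp_features(x, rates=(1, 2, 4)):
    """Stack responses at several dilation rates; a real ASPP head
    would fuse these branches with a learned 1x1 convolution."""
    kernel = np.full((3, 3), 1.0 / 9.0)  # averaging kernel as a stand-in
    return np.stack([dilated_conv2d(x, kernel, r) for r in rates])
```

Each branch costs the same number of multiplies but sees a wider neighborhood, which is why ASPP-style modules gather context while staying lightweight; the omni-dimensional dynamic variant in the paper additionally adapts the kernels per input.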

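The normalization-based 3D attention module described in the abstract reweights the cost volume along the channel, spatial, and depth dimensions. As a rough, hypothetical stand-in for the channel branch (not the paper's implementation), a NAM-style gate can be sketched in NumPy: each channel is standardized, and channels with a larger share of the total deviation receive stronger gating weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_norm_attention(vol, eps=1e-5):
    """NAM-style channel attention over a cost volume of shape (C, D, H, W).

    Each channel is standardized, and its gating strength is its share
    of the total per-channel deviation, so higher-variance (more
    informative) channels are suppressed less.
    """
    c = vol.shape[0]
    flat = vol.reshape(c, -1)
    std = flat.std(axis=1, keepdims=True) + eps
    normed = (flat - flat.mean(axis=1, keepdims=True)) / std
    weights = std / std.sum()            # normalized channel importance
    gate = sigmoid(weights * normed)     # values strictly in (0, 1)
    return (flat * gate).reshape(vol.shape)
```

The spatial and depth branches would follow the same pattern with the statistics taken over different axes; per the abstract, the paper applies all three to the cost volume during regularization.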

Figures (PMC11326553):
g001: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595c/11326553/233a8eb2b5f4/pone.0309029.g001.jpg
g002: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595c/11326553/8cde5503f0f7/pone.0309029.g002.jpg
g003: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595c/11326553/f1d1c89088e0/pone.0309029.g003.jpg
g004: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595c/11326553/d386cf15efcd/pone.0309029.g004.jpg
g005: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595c/11326553/de6f80778861/pone.0309029.g005.jpg
g006: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595c/11326553/14bf1cc6fca6/pone.0309029.g006.jpg
g007: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595c/11326553/1493010fcf48/pone.0309029.g007.jpg
g008: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/595c/11326553/8e05bf59bd6f/pone.0309029.g008.jpg

Similar articles

1. OD-MVSNet: Omni-dimensional dynamic multi-view stereo network. PLoS One. 2024 Aug 15;19(8):e0309029. doi: 10.1371/journal.pone.0309029. eCollection 2024.
2. DRI-MVSNet: A depth residual inference network for multi-view stereo images. PLoS One. 2022 Mar 23;17(3):e0264721. doi: 10.1371/journal.pone.0264721. eCollection 2022.
3. A Light Multi-View Stereo Method with Patch-Uncertainty Awareness. Sensors (Basel). 2024 Feb 17;24(4):1293. doi: 10.3390/s24041293.
4. Visibility-Aware Point-Based Multi-View Stereo Network. IEEE Trans Pattern Anal Mach Intell. 2021 Oct;43(10):3695-3708. doi: 10.1109/TPAMI.2020.2988729. Epub 2021 Sep 2.
5. Cost Volume Pyramid Based Depth Inference for Multi-View Stereo. IEEE Trans Pattern Anal Mach Intell. 2022 Sep;44(9):4748-4760. doi: 10.1109/TPAMI.2021.3082562. Epub 2022 Aug 4.
6. Parallax attention stereo matching network based on the improved group-wise correlation stereo network. PLoS One. 2022 Feb 9;17(2):e0263735. doi: 10.1371/journal.pone.0263735. eCollection 2022.
7. BSI-MVS: multi-view stereo network with bidirectional semantic information. Sci Rep. 2024 Mar 21;14(1):6766. doi: 10.1038/s41598-024-55612-6.
8. A stereo matching algorithm based on the improved PSMNet. PLoS One. 2021 Aug 19;16(8):e0251657. doi: 10.1371/journal.pone.0251657. eCollection 2021.
9. NR-MVSNet: Learning Multi-View Stereo Based on Normal Consistency and Depth Refinement. IEEE Trans Image Process. 2023;32:2649-2662. doi: 10.1109/TIP.2023.3272170. Epub 2023 May 12.
10. EI-MVSNet: Epipolar-Guided Multi-View Stereo Network With Interval-Aware Label. IEEE Trans Image Process. 2024;33:753-766. doi: 10.1109/TIP.2023.3347929. Epub 2024 Jan 12.

References cited in this article

1. Cost Volume Pyramid Based Depth Inference for Multi-View Stereo. IEEE Trans Pattern Anal Mach Intell. 2022 Sep;44(9):4748-4760. doi: 10.1109/TPAMI.2021.3082562. Epub 2022 Aug 4.
2. Accurate, dense, and robust multiview stereopsis. IEEE Trans Pattern Anal Mach Intell. 2010 Aug;32(8):1362-76. doi: 10.1109/TPAMI.2009.161.
3. Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci. 2002 Mar;3(3):201-15. doi: 10.1038/nrn755.
4. The interpretation of structure from motion. Proc R Soc Lond B Biol Sci. 1979 Jan 15;203(1153):405-26. doi: 10.1098/rspb.1979.0006.