Department of Artificial Intelligence Convergence, Pukyong National University, Busan 48513, Korea.
Department of Computer Engineering, Dong-A University, Busan 49315, Korea.
Sensors (Basel). 2021 Apr 9;21(8):2644. doi: 10.3390/s21082644.
A vital and challenging task in computer vision is 3D object classification and retrieval, with many practical applications such as intelligent robots, autonomous driving, multimedia content processing and retrieval, and augmented/mixed reality. Various deep learning methods have been introduced to solve the classification and retrieval problems of 3D objects. View-based methods perform the best among current techniques (view-based, voxelization, and point cloud methods), but almost all of them use many views to compensate for the spatial information lost in projection. Using many views makes the network structure more complicated because it requires parallel Convolutional Neural Networks (CNNs). In this paper, we propose a novel method that combines a Global Point Signature Plus with a Deep Wide Residual Network, namely GPSP-DWRN. Global Point Signature Plus (GPSPlus) is a novel descriptor because it can capture more shape information of the 3D object from a single view. First, an original 3D model was converted into a colored one by applying GPSPlus. Then, the obtained 2D projection of this colored 3D model was stored in a 32 × 32 × 3 matrix. This matrix was the input to a Deep Wide Residual Network, which used a single CNN structure. We evaluated GPSP-DWRN on a retrieval task using the ShapeNetCore55 dataset and on a classification task using two well-known datasets, ModelNet10 and ModelNet40. Based on our experimental results, our framework performed better than the state-of-the-art methods.
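The abstract does not include code, but the classification stage it describes (a single CNN with wide residual blocks applied to the 32 × 32 × 3 GPSPlus projection) can be illustrated with a minimal, hypothetical PyTorch sketch. The layer widths, block layout, and class names below are assumptions for illustration only and are not taken from the paper.

```python
# Hypothetical sketch of the single-CNN classification stage described above,
# assuming the 32x32x3 GPSPlus projection is already available as an image tensor.
# Widths and depths are illustrative guesses, not the authors' published architecture.
import torch
import torch.nn as nn

class WideResidualBlock(nn.Module):
    """Basic wide residual block: two 3x3 convolutions plus a skip connection."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # Project the skip path when spatial size or channel count changes.
        self.skip = (nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False)
                     if (stride != 1 or in_ch != out_ch) else nn.Identity())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.skip(x))

class GPSPClassifier(nn.Module):
    """Single-CNN classifier over a 32x32x3 projection (hypothetical layout)."""
    def __init__(self, num_classes=40, width=4):
        super().__init__()
        self.stem = nn.Conv2d(3, 16, 3, padding=1, bias=False)
        self.stage1 = WideResidualBlock(16, 16 * width)
        self.stage2 = WideResidualBlock(16 * width, 32 * width, stride=2)
        self.stage3 = WideResidualBlock(32 * width, 64 * width, stride=2)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(64 * width, num_classes))

    def forward(self, x):  # x: (batch, 3, 32, 32)
        return self.head(self.stage3(self.stage2(self.stage1(self.stem(x)))))

# Example: classify one projection into the 40 ModelNet40 categories.
logits = GPSPClassifier(num_classes=40)(torch.randn(1, 3, 32, 32))
```

For retrieval, the same backbone could serve as a feature extractor by taking the pooled activations before the final linear layer; that usage is likewise an assumption, not a detail given in the abstract.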