IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7525-7541. doi: 10.1109/TPAMI.2022.3221785. Epub 2023 May 5.
The view-based approach, which recognizes a 3D shape through its projected 2D images, has achieved state-of-the-art results for 3D shape recognition. The major challenges are how to aggregate multi-view features and how to deal with 3D shapes in arbitrary poses. We propose two versions of a novel view-based Graph Convolutional Network, dubbed view-GCN and view-GCN++, to recognize 3D shapes based on a graph representation of multiple views. We first construct a view-graph with multiple views as graph nodes, then design two graph convolutional networks over the view-graph to hierarchically learn a discriminative shape descriptor that considers the relations among multiple views. Specifically, view-GCN is a hierarchical network based on two pivotal operations, i.e., feature transform based on local positional and non-local graph convolution, and graph coarsening based on a selective view-sampling operation. To deal with rotation sensitivity, we further propose view-GCN++ with a local attentional graph convolution operation and a rotation-robust view-sampling operation for graph coarsening. With these designs, view-GCN++ achieves invariance to transformations under the finite subgroup of the rotation group SO(3). Extensive experiments on benchmark datasets (i.e., ModelNet40, ScanObjectNN, RGBD and ShapeNet Core55) show that view-GCN and view-GCN++ achieve state-of-the-art results for 3D shape classification and retrieval tasks under aligned and rotated settings.
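As a rough illustration of the pipeline the abstract outlines (views as graph nodes, graph convolution over the view-graph, then coarsening by view sampling), the following NumPy sketch may help. It is not the authors' implementation. All function names are hypothetical, and several choices are assumptions that the abstract does not confirm: the k-nearest-neighbor graph over camera positions, plain mean aggregation in place of the local positional and non-local graph convolutions, farthest point sampling in place of the selective view-sampling operation, and a max-pooled readout concatenated across levels.

```python
import numpy as np


def build_view_graph(cam_pos, k=3):
    """Symmetric kNN adjacency (with self-loops) over views; kNN is an assumption."""
    n = cam_pos.shape[0]
    dist = np.linalg.norm(cam_pos[:, None, :] - cam_pos[None, :, :], axis=-1)
    adj = np.eye(n)
    for i in range(n):
        nearest = np.argsort(dist[i])[1:k + 1]  # index 0 is the view itself
        adj[i, nearest] = 1.0
    return np.maximum(adj, adj.T)


def graph_conv(feats, adj, weight):
    """Mean-aggregate neighbor features, apply a linear map, then ReLU."""
    norm_adj = adj / adj.sum(axis=1, keepdims=True)
    return np.maximum(norm_adj @ feats @ weight, 0.0)


def farthest_point_sample(cam_pos, m):
    """Greedily pick m well-spread view indices, starting from view 0."""
    chosen = [0]
    min_dist = np.linalg.norm(cam_pos - cam_pos[0], axis=1)
    while len(chosen) < m:
        nxt = int(np.argmax(min_dist))
        chosen.append(nxt)
        new_dist = np.linalg.norm(cam_pos - cam_pos[nxt], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return np.array(chosen)


def shape_descriptor(feats, cam_pos, weights, k=3):
    """Alternate graph convolution and coarsening; concatenate pooled features."""
    pooled = []
    for weight in weights:
        adj = build_view_graph(cam_pos, k=min(k, len(cam_pos) - 1))
        feats = graph_conv(feats, adj, weight)
        pooled.append(feats.max(axis=0))  # readout for this level
        keep = farthest_point_sample(cam_pos, max(len(cam_pos) // 2, 1))
        feats, cam_pos = feats[keep], cam_pos[keep]  # halve the view-graph
    return np.concatenate(pooled)
```

With 12 views carrying 8-dimensional features and two 8x8 weight matrices, `shape_descriptor` runs two levels (12 views, then 6) and returns a 16-dimensional vector. In a trained model the per-view features would come from a 2D image backbone and the weights would be learned; here they are just arrays supplied by the caller.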