Li Jiaxin, Wang Hongxing, Tan Jiawei, Yuan Junsong
IEEE Trans Image Process. 2024;33:3564-3577. doi: 10.1109/TIP.2024.3404234. Epub 2024 Jun 4.
Part-level 3D shape representations are crucial to shape reasoning and understanding. Two key sub-tasks are: 1) shape abstraction, which creates primitive-based object parts; and 2) shape segmentation, which finds partition-based object parts. However, for 3D object point clouds, most advanced methods produce parts by relying on task-specific priors, such as similarity metrics and primitive geometries, resulting in misleading parts that deviate from semantics. To address these limitations, we establish a foundation for joint shape abstraction and shape segmentation as formal linear transformations within a shared latent space. This space encapsulates the essential dual-purpose membership information that links points and object parts, so that the two tasks reinforce each other. We demonstrate that the transformations are underpinned by a derivation based on k-means, Non-negative Matrix Factorization (NMF), and the attention mechanism. Building on this derivation, we introduce Latent Membership Pursuit (LMP) for joint optimization of shape abstraction and segmentation. LMP uses a shared latent representation of object part membership to autonomously identify common object parts in both tasks without any supervision or priors. Furthermore, we adapt deformable superquadrics (DSQs) as primitives to capture variable part-level geometric and semantic information. Experiments on benchmark datasets validate that our approach enables mutual learning of shape abstraction and segmentation, and promotes consistent interpretations of 3D object shapes across instances and even categories in a fully unsupervised manner.
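The abstract describes a point-to-part membership shared by both tasks and ties it to k-means, NMF, and attention. The sketch below is only a toy illustration of that general idea and is not the authors' LMP implementation: it assumes a soft k-means assignment (a softmax over negative squared distances, which has the same form as attention weights) as the membership matrix, reads segmentation labels from its row-wise argmax, and obtains part-level centers as a normalized linear transformation of the points. The function names, the `temperature` parameter, and the two-cluster data are all hypothetical.

```python
import numpy as np


def soft_membership(points, centers, temperature=0.1):
    """Soft assignment of N points to K parts; each row sums to 1.

    Softmax over negative squared distances (soft k-means), which has
    the same form as attention weights. Assumed here for illustration.
    """
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # (N, K)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)


def segmentation_labels(membership):
    """Partition view: each point takes its highest-membership part."""
    return membership.argmax(axis=1)


def part_centers(membership, points):
    """Part view: membership-weighted means, i.e. normalized M^T X."""
    mass = membership.sum(axis=0)[:, None]  # (K, 1) total weight per part
    return (membership.T @ points) / np.maximum(mass, 1e-8)


# Hypothetical toy cloud: two well-separated blobs standing in for two parts.
rng = np.random.default_rng(0)
cloud = np.concatenate([
    rng.normal(loc=[-2.0, 0.0, 0.0], scale=0.2, size=(100, 3)),
    rng.normal(loc=[2.0, 0.0, 0.0], scale=0.2, size=(100, 3)),
])

centers = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])  # rough initial guess
for _ in range(5):
    membership = soft_membership(cloud, centers)  # shared membership matrix
    centers = part_centers(membership, cloud)     # part-level update

labels = segmentation_labels(membership)          # point-level partition
```

In this toy setting the same matrix `membership` drives both outputs: its columns weight the points that define each part, and its rows give each point's part label. The paper's method works in a learned latent space and fits deformable superquadrics to the parts, neither of which is modeled here.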