Sun Kai, Zhang Jiangshe, Xu Shuang, Zhao Zixiang, Zhang Chunxia, Liu Junmin, Hu Junying
IEEE Trans Neural Netw Learn Syst. 2025 Mar;36(3):4091-4102. doi: 10.1109/TNNLS.2023.3326606. Epub 2025 Feb 28.
Recently, view-based approaches, which recognize a 3D object through its projected 2D images, have been extensively studied and have achieved considerable success in 3D object recognition. Nevertheless, most of them use a pooling operation to aggregate view-wise features, which usually leads to a loss of visual information. To tackle this problem, we propose a novel layer, called the capsule attention layer (CAL), that uses an attention mechanism to fuse the features expressed by capsules. In detail, instead of the dynamic routing algorithm, we use an attention module to transmit information from lower level capsules to higher level capsules, which markedly improves the speed of capsule networks. In particular, the view pooling layer of the multiview convolutional neural network (MVCNN) becomes a special case of our CAL when the trainable weights take certain values. Furthermore, based on CAL, we propose a capsule attention convolutional neural network (CACNN) for 3D object recognition. Extensive experimental results on three benchmark datasets demonstrate the efficiency of our CACNN and show that it outperforms many state-of-the-art methods.
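The claim that view pooling is a special case of attention-based fusion can be illustrated with a small sketch. The abstract does not give the CAL equations or say which weight values recover MVCNN's view pooling, so the code below is not the authors' layer: it assumes a hypothetical per-dimension softmax attention over plain feature vectors (not capsules), with a single assumed parameter `scale` standing in for the trainable weights. Under that assumption, `scale = 0` gives uniform weights (mean pooling across views), and a large `scale` approaches element-wise max pooling across views.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_view_fusion(view_features, scale):
    """Fuse per-view feature vectors with per-dimension softmax attention.

    view_features: list of V vectors, each of length D.
    scale: hypothetical stand-in for trainable weights; the attention
           score of view v in dimension d is scale * view_features[v][d].
    """
    num_dims = len(view_features[0])
    fused = []
    for d in range(num_dims):
        column = [view[d] for view in view_features]
        weights = softmax([scale * x for x in column])
        fused.append(sum(w * x for w, x in zip(weights, column)))
    return fused


# Three illustrative views with three features each (made-up numbers).
views = [
    [0.2, 1.0, -0.5],
    [0.9, 0.1, 0.3],
    [0.4, 0.6, 0.0],
]

mean_like = attention_view_fusion(views, scale=0.0)   # uniform weights
max_like = attention_view_fusion(views, scale=50.0)   # near one-hot weights
```

With intermediate values of `scale`, every view contributes to the fused feature in proportion to its weight, which is the sense in which attention can retain information that a hard max over views discards.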