Beijing Institute of System Engineering, Beijing 100101, China.
Artificial Intelligence Institute of China Electronics Technology Group Corporation, Beijing 100041, China.
Sensors (Basel). 2023 Apr 20;23(8):4123. doi: 10.3390/s23084123.
Three-dimensional point cloud registration, which aims to find the transformation that best aligns two point clouds, is a widely studied problem in computer vision with many applications, such as underground mining. Many learning-based approaches have been developed and shown to be effective for point cloud registration. In particular, attention-based models have achieved outstanding performance owing to the extra contextual information captured by attention mechanisms. To limit the high computational cost of attention, an encoder-decoder framework is often employed to extract features hierarchically, with the attention module applied only in the middle. This compromises the effectiveness of the attention module. To tackle this issue, we propose a novel model with attention layers embedded in both the encoder and decoder stages. In our model, self-attention layers are applied in the encoder to model the relationships between points within each point cloud, while the decoder uses cross-attention layers to enrich features with contextual information. Extensive experiments on public datasets show that our model achieves high-quality results on the registration task.
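The distinction between self-attention and cross-attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' architecture. It assumes plain scaled dot-product attention without learned projections, a residual connection, and cross-attention in which one cloud's features attend to the other cloud's features. The function names (`attention`, `self_attention`, `cross_attention`) and the feature sizes are hypothetical, chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    """Scaled dot-product attention. Returns (output, weights)."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_queries, n_keys)
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ values, weights


def self_attention(feats):
    # Every point attends to all points of the SAME cloud (encoder role).
    out, _ = attention(feats, feats, feats)
    return feats + out                       # residual connection (assumption)


def cross_attention(feats_a, feats_b):
    # Points of cloud A attend to points of cloud B (decoder role, assumed).
    out, _ = attention(feats_a, feats_b, feats_b)
    return feats_a + out                     # residual connection (assumption)


# Hypothetical per-point features: 128 source points and 96 target points,
# each described by a 32-dimensional feature vector.
rng = np.random.default_rng(0)
src = rng.normal(size=(128, 32))
tgt = rng.normal(size=(96, 32))

# Encoder stage: relate points inside each cloud.
src_enc = self_attention(src)
tgt_enc = self_attention(tgt)

# Decoder stage: enrich each cloud's features with context from the other.
src_dec = cross_attention(src_enc, tgt_enc)
tgt_dec = cross_attention(tgt_enc, src_enc)
```

The attention weight matrix has one row per query point and one column per key point, so its size grows with the product of the two point counts. That growth is the computational cost that motivates applying attention to hierarchically downsampled features.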