Umar Saidu, Taherkhani Aboozar
School of Computer Science and Informatics, De Montfort University, Leicester LE1 9BH, UK.
Sensors (Basel). 2024 Oct 5;24(19):6446. doi: 10.3390/s24196446.
Rapid advances in 3D sensor technology have made point cloud data increasingly available in applications such as autonomous driving, robotics, and virtual and augmented reality. This creates a growing need for deep learning methods that can process such data. Point clouds are difficult to use directly as inputs to many deep learning techniques because the data are unstructured and unordered, so machine learning models built for images or videos cannot be applied to them directly. Although point cloud research has attracted considerable attention and a range of methods has been developed over the past decade, very few studies work directly with point cloud data. Most convert the point cloud into 2D images or voxels through pre-processing that causes information loss. Methods that operate directly on point clouds are still at an early stage, which limits the performance and accuracy of the resulting models. Advanced techniques from classical convolutional neural networks, such as the attention mechanism, need to be transferred to methods that work directly with point clouds. In this research, an attention mechanism is proposed for deep convolutional neural networks that process point clouds directly. The attention module is based on specific pooling operations designed to be applied directly to point clouds in order to extract vital information from them. The method was evaluated on segmentation of the ShapeNet dataset. With the attention method applied, the proposed framework achieved a higher mean intersection over union (mIoU) score than a state-of-the-art base framework without the attention mechanism.
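For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a mean intersection over union score can be computed for one segmented shape. It is not the authors' evaluation code. The function name `mean_iou`, the per-point label lists, and the convention that a part absent from both prediction and ground truth counts as an IoU of 1 are assumptions made for illustration.

```python
def mean_iou(pred, target, num_parts):
    """Mean IoU over part labels for a single shape.

    pred and target are equal-length lists of integer part labels,
    one per point. num_parts is the number of part classes (assumed
    to be labelled 0 .. num_parts - 1).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")

    ious = []
    for part in range(num_parts):
        # Points labelled `part` in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == part and t == part)
        # Points labelled `part` in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == part or t == part)
        # Assumed convention: a part missing from both counts as IoU 1.
        ious.append(1.0 if union == 0 else inter / union)
    return sum(ious) / num_parts


# Hypothetical labels for a six-point shape with three parts.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 0, 1, 2, 2, 2]
print(round(mean_iou(pred, target, 3), 4))
```

In this example the per-part IoUs are 1, 1/2, and 2/3, so the mean is 13/18. A dataset-level score would average such per-shape values, and the exact averaging scheme used in the paper is not stated in the abstract.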