Shuai Hui, Xu Xiang, Liu Qingshan
IEEE Trans Image Process. 2021;30:4973-4984. doi: 10.1109/TIP.2021.3073660. Epub 2021 May 14.
This paper proposes a Backward Attentive Fusing Network with Local Aggregation Classifier (BAF-LAC) to improve 3D point cloud semantic segmentation. The network consists of a Backward Attentive Fusing Encoder-Decoder (BAF-ED), which learns semantic features, and a Local Aggregation Classifier (LAC), which maintains the context-awareness of points. BAF-ED narrows the semantic gap between the encoder and the decoder by fusing multi-layer encoder features with the decoder features: high-level encoder features are transformed into an attention map that modulates low-level encoder features in the backward direction. LAC adaptively enhances the intermediate features in point-wise MLPs by aggregating the features of neighboring points into the center point. It replaces commonly used post-processing techniques and carries context consistency into the classifier itself. Equipped with these modules, BAF-LAC can extract discriminative semantic features and predict smoother results. Extensive experiments on Semantic3D, SemanticKITTI, and S3DIS show that the proposed method achieves results competitive with state-of-the-art methods.
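The two mechanisms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the attention map is computed or how neighbors are weighted. The function names, the element-wise sigmoid gate, the nearest-coarse-point lookup (`upsample_idx`), and the dot-product softmax weighting are all assumptions chosen for illustration, written in plain Python without any deep learning framework.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def backward_attentive_fuse(low_feats, high_feats, upsample_idx):
    """Modulate low-level features with an attention map built from
    high-level features (hypothetical stand-in for the BAF-ED fusion).

    low_feats:    N x C list of fine-resolution encoder features
    high_feats:   M x C list of coarse-resolution encoder features
    upsample_idx: for each fine point, the index of the coarse point
                  whose features it borrows (assumed nearest neighbor)
    """
    fused = []
    for i, low in enumerate(low_feats):
        high = high_feats[upsample_idx[i]]
        # Assumption: the attention map is an element-wise sigmoid gate.
        attention = [sigmoid(h) for h in high]
        fused.append([a * f for a, f in zip(attention, low)])
    return fused


def local_aggregate(feats, neighbor_idx):
    """Aggregate neighbor features into each center point
    (hypothetical stand-in for the LAC aggregation step).

    feats:        N x C list of per-point features
    neighbor_idx: for each point, a non-empty list of neighbor indices
    """
    out = []
    for i, center in enumerate(feats):
        neighbors = [feats[j] for j in neighbor_idx[i]]
        # Assumption: weights are a softmax over dot-product similarity
        # between the center point and each neighbor.
        scores = [sum(c * n for c, n in zip(center, nb)) for nb in neighbors]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([
            sum(w * nb[c] for w, nb in zip(weights, neighbors))
            for c in range(len(center))
        ])
    return out
```

In this sketch the gate scales each low-level channel by a value in (0, 1) derived from the coarser layer, and the aggregation step replaces each point's feature with a weighted average over its neighborhood, which is one simple way to obtain smoother per-point predictions without a separate post-processing pass.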