Li Junxuan, Zhang Yuanfang, Han Jiayi, Han Peng, Luo Kaiqing
Guangdong Provincial Engineering Research Center for Optoelectronic Instrument, School of Electronic Science and Engineering (School of Microelectronics), South China Normal University, Foshan 528225, China.
School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China.
Sensors (Basel). 2024 Dec 2;24(23):7702. doi: 10.3390/s24237702.
Vehicle-to-vehicle communication lets a vehicle capture sensor information from diverse perspectives, which greatly aids semantic scene completion in autonomous driving. However, misalignment between the features of the ego vehicle and those of the cooperative vehicles introduces ambiguity that degrades completion accuracy and semantic information. In this paper, we propose a Two-Stream Multi-Vehicle collaboration approach (TSMV), which divides the features of the collaborative vehicles into two streams and regresses them interactively. To overcome the problems caused by feature misalignment, we design the Neighborhood Self-Cross Attention Transformer (NSCAT) module, which enables the ego vehicle to query the most similar local features from the collaborative vehicles through cross-attention rather than assuming spatial-temporal synchronization. A 3D occupancy map is finally generated from the aggregated features of the collaborative vehicles. Experimental results on both the V2VSSC and SemanticOPV2V datasets demonstrate that TSMV outperforms state-of-the-art collaborative semantic scene completion techniques.
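The idea of letting the ego vehicle query nearby collaborator features, instead of assuming the two feature maps are perfectly aligned, can be illustrated with a minimal sketch. This is not the authors' NSCAT module: it assumes a single collaborator, 2D bird's-eye-view feature maps, a single attention head, and no learned projections, and the function name `neighborhood_cross_attention` and the `radius` parameter are hypothetical names chosen for illustration.

```python
import numpy as np

def neighborhood_cross_attention(ego, collab, radius=1):
    """Simplified neighborhood cross-attention (illustrative assumption,
    not the paper's implementation).

    ego, collab: float arrays of shape (H, W, C).
    Each ego cell attends over the (2*radius+1)^2 window of collaborator
    cells centred on the same grid position. Returns an (H, W, C) array.
    """
    H, W, C = ego.shape
    k = 2 * radius + 1
    # Zero-pad the collaborator map so every ego cell has a full k x k window.
    padded = np.pad(collab, ((radius, radius), (radius, radius), (0, 0)))
    # Boolean mask of window positions that lie inside the real map.
    valid = np.pad(np.ones((H, W), dtype=bool), radius)
    out = np.zeros((H, W, C), dtype=float)
    for i in range(H):
        for j in range(W):
            window = padded[i:i + k, j:j + k].reshape(-1, C)  # keys and values
            mask = valid[i:i + k, j:j + k].reshape(-1)
            # Scaled dot-product similarity between the ego query and each key.
            scores = window @ ego[i, j] / np.sqrt(C)
            scores = np.where(mask, scores, -np.inf)  # ignore padded cells
            weights = np.exp(scores - scores.max())   # stable softmax
            weights /= weights.sum()
            out[i, j] = weights @ window
    return out
```

In this sketch, a collaborator map that is shifted by one cell relative to the ego map is still matched correctly wherever the true counterpart falls inside the search window, which is the intuition behind querying the most similar local features rather than relying on exact spatial-temporal synchronization.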