Wang Le, Liu Hongzhen, Zhou Sanping, Tang Wei, Hua Gang
IEEE Trans Image Process. 2023;32:764-778. doi: 10.1109/TIP.2022.3226414. Epub 2023 Jan 18.
Video panoptic segmentation is an important but challenging task in computer vision: it not only performs panoptic segmentation of each frame, but also associates the same instances across adjacent frames. Due to the lack of temporal coherence modeling, most existing approaches suffer from identity switches during instance association and cannot handle the ambiguous segmentation boundaries caused by motion blur. To address these issues, we introduce a simple yet effective Instance Motion Tendency Network (IMTNet) for video panoptic segmentation. It learns a global motion tendency map for instance association and a hierarchical classifier for motion boundary refinement. Specifically, a Global Motion Tendency Module (GMTM) is designed to learn robust motion features from optical flow, which directly associate each instance in the previous frame with its corresponding instance in the current frame. In addition, we propose a Motion Boundary Refinement Module (MBRM) that learns a hierarchical classifier for the boundary pixels of moving targets, effectively revising inaccurate segmentation predictions. Experimental results on both the Cityscapes and Cityscapes-VPS datasets show that our IMTNet outperforms most state-of-the-art approaches.
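To make the association idea concrete, the sketch below shows one simple way optical flow can link instances across frames: warp each previous-frame instance mask forward by the flow and match it to the current-frame mask with the highest overlap. This is a hypothetical, greatly simplified illustration of flow-based association, not the paper's actual GMTM; the nearest-neighbour warping, the greedy IoU matching, and the `iou_thresh` parameter are all assumptions made for this example.

```python
import numpy as np

def warp_mask(mask, flow):
    """Warp a binary instance mask forward by a per-pixel flow field.
    Nearest-neighbour scatter; assumes flow[..., 0] = dx, flow[..., 1] = dy.
    (A simplification for illustration, not the paper's GMTM.)"""
    h, w = mask.shape
    warped = np.zeros_like(mask)
    ys, xs = np.nonzero(mask)
    nx = np.clip((xs + flow[ys, xs, 0]).round().astype(int), 0, w - 1)
    ny = np.clip((ys + flow[ys, xs, 1]).round().astype(int), 0, h - 1)
    warped[ny, nx] = 1
    return warped

def associate(prev_masks, curr_masks, flow, iou_thresh=0.3):
    """Greedily match each previous-frame instance to the current-frame
    instance whose mask best overlaps its flow-warped prediction."""
    matches = {}
    for i, pm in enumerate(prev_masks):
        wm = warp_mask(pm, flow)
        best_j, best_iou = -1, iou_thresh
        for j, cm in enumerate(curr_masks):
            inter = np.logical_and(wm, cm).sum()
            union = np.logical_or(wm, cm).sum()
            iou = inter / union if union else 0.0
            if iou > best_iou:
                best_j, best_iou = j, iou
        if best_j >= 0:
            matches[i] = best_j  # previous instance i -> current instance j
    return matches
```

In this toy scheme, identity switches arise exactly when the warped mask overlaps the wrong instance more than the right one, which is the failure mode a learned global motion tendency map is meant to suppress.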