Zhang Chongzhen, Tang Yang, Zhao Chaoqiang, Sun Qiyu, Ye Zhencheng, Kurths Jurgen
IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5404-5415. doi: 10.1109/TNNLS.2021.3072883. Epub 2021 Nov 30.
Semantic segmentation and depth completion are two challenging tasks in scene understanding, and both are widely used in robotics and autonomous driving. Several studies have trained the two tasks jointly with small modifications, such as changing the last layer, but the result of one task is not used to improve the other, even though the tasks share similarities. In this article, we propose multitask generative adversarial networks (Multitask GANs), which not only perform semantic segmentation and depth completion but also use the generated semantic images to improve the accuracy of depth completion. In addition, we improve the details of the generated semantic images by extending CycleGAN with multiscale spatial pooling blocks and a structural similarity reconstruction loss. Furthermore, considering the inner consistency between semantic and geometric structures, we develop a semantic-guided smoothness loss to improve depth completion results. Extensive experiments on the Cityscapes data set and the KITTI depth completion benchmark show that Multitask GANs achieve competitive performance on both semantic segmentation and depth completion.
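The abstract names a semantic-guided smoothness loss but does not give its formula. The sketch below is only one plausible reading of the idea, not the authors' definition: it penalizes depth differences between neighbouring pixels that share a semantic label and ignores differences across label boundaries, where depth discontinuities are expected. The function name `semantic_guided_smoothness`, the use of hard integer labels, and the L1 penalty are all assumptions made for illustration.

```python
import numpy as np


def semantic_guided_smoothness(depth, labels):
    """Hypothetical loss: mean absolute depth difference over neighbouring
    pixel pairs that carry the same semantic label (an assumption, not the
    formula from the paper)."""
    depth = np.asarray(depth, dtype=float)
    labels = np.asarray(labels)

    # Absolute depth differences between horizontal and vertical neighbours.
    dx = np.abs(depth[:, 1:] - depth[:, :-1])
    dy = np.abs(depth[1:, :] - depth[:-1, :])

    # Masks that are True where both neighbours belong to the same class.
    same_x = labels[:, 1:] == labels[:, :-1]
    same_y = labels[1:, :] == labels[:-1, :]

    # Sum the penalty only inside semantic regions, then normalize by the
    # number of contributing pairs (guarding against division by zero).
    total = (dx * same_x).sum() + (dy * same_y).sum()
    count = same_x.sum() + same_y.sum()
    return total / max(count, 1)


# A depth step that coincides with a semantic boundary is not penalized,
# while the same step inside a single semantic region is.
depth = [[1, 1, 5, 5], [1, 1, 5, 5]]
aligned_labels = [[0, 0, 1, 1], [0, 0, 1, 1]]
uniform_labels = [[0, 0, 0, 0], [0, 0, 0, 0]]
loss_aligned = semantic_guided_smoothness(depth, aligned_labels)
loss_uniform = semantic_guided_smoothness(depth, uniform_labels)
```

In a training setting the labels would come from the generated semantic image and the loss would be written in a differentiable framework; NumPy is used here only to keep the sketch self-contained.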