College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang, China.
School of Aeronautics, Northwestern Polytechnical University, Xi'an, Shaanxi, China.
PLoS One. 2018 Apr 12;13(4):e0195114. doi: 10.1371/journal.pone.0195114. eCollection 2018.
Explicit structural inference is key to improving the accuracy of scene parsing, while adversarial training can reinforce spatial contiguity in the output segmentations. To take advantage of both structural learning and adversarial training, we propose a novel deep learning architecture called Structural Inference Embedded Adversarial Networks (SIEANs) for pixel-wise scene labeling. The generator of SIEANs, a newly designed scene parsing network, combines convolutional neural networks and long short-term memory networks to learn the global contextual information of objects in four different directions from RGB-(D) images, which allows it to describe the (three-dimensional) spatial distributions of objects more comprehensively and accurately. To further improve performance, we use adversarial training to optimize the generator together with a discriminator. This not only detects and corrects higher-order inconsistencies between the predicted segmentations and the corresponding ground truths, but also exploits the full capacity of the generator by fine-tuning its parameters to obtain higher consistency. Experimental results demonstrate that the proposed SIEANs achieve better performance than most state-of-the-art methods on the PASCAL VOC 2012, SIFT FLOW, PASCAL Person-Part, Cityscapes, Stanford Background, NYUDv2, and SUN-RGBD datasets.
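The idea of gathering context "in four different directions" can be illustrated with a minimal sketch. This is not the authors' generator. It assumes a single-channel feature map stored as a list of lists, and it swaps the LSTM for a toy recurrence h_t = tanh(w*x_t + u*h_{t-1}) with hypothetical fixed weights `w` and `u`. The function names `scan` and `four_direction_context` are illustrative only.

```python
import math


def scan(seq, w=0.5, u=0.5):
    """Toy recurrence h_t = tanh(w*x_t + u*h_{t-1}); a stand-in for an LSTM (assumption)."""
    h, out = 0.0, []
    for x in seq:
        h = math.tanh(w * x + u * h)
        out.append(h)
    return out


def four_direction_context(grid):
    """For every cell, return a 4-tuple of hidden states from the
    left-to-right, right-to-left, top-to-bottom and bottom-to-top sweeps."""
    rows, cols = len(grid), len(grid[0])
    # Horizontal sweeps run along each row, indexed [row][col].
    l2r = [scan(row) for row in grid]
    r2l = [scan(row[::-1])[::-1] for row in grid]
    # Vertical sweeps run along each column, indexed [col][row].
    columns = [[grid[r][c] for r in range(rows)] for c in range(cols)]
    t2b = [scan(col) for col in columns]
    b2t = [scan(col[::-1])[::-1] for col in columns]
    return [
        [(l2r[r][c], r2l[r][c], t2b[c][r], b2t[c][r]) for c in range(cols)]
        for r in range(rows)
    ]


# A 3x3 map with a single active cell in the top-right corner.
feature_map = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
context = four_direction_context(feature_map)
```

In this toy example the top-left cell receives a nonzero signal only from the right-to-left sweep, and the bottom-right cell only from the top-to-bottom sweep. Concatenating the four sweeps is what gives every pixel a view of the whole image. A real implementation would use learned LSTM cells over multi-channel CNN features.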