Scene Segmentation with DAG-Recurrent Neural Networks.

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2018 Jun;40(6):1480-1493. doi: 10.1109/TPAMI.2017.2712691. Epub 2017 Jun 6.

Abstract

In this paper, we address the challenging task of scene segmentation. To capture the rich contextual dependencies over image regions, we propose Directed Acyclic Graph-Recurrent Neural Networks (DAG-RNN) to perform context aggregation over locally connected feature maps. More specifically, the DAG-RNN is placed on top of a pre-trained CNN (the feature extractor) to embed context into local features and thereby enhance their representational capability. Compared with a plain CNN (as in Fully Convolutional Networks, FCN), the DAG-RNN is empirically found to be significantly more effective at aggregating context, and it therefore demonstrates noticeable performance superiority over FCNs on scene segmentation. Besides, the DAG-RNN entails dramatically fewer parameters and demands fewer computation operations, which makes it better suited for deployment on resource-constrained embedded devices. Meanwhile, class occurrence frequencies are extremely imbalanced in scene segmentation, so we propose a novel class-weighted loss to train the segmentation network. The loss assigns reasonably higher attention weights to infrequent classes during network training, which is essential to boost their parsing performance. We evaluate our segmentation network on three challenging public scene segmentation benchmarks: SiftFlow, Pascal Context and COCO Stuff, on all of which we achieve very impressive segmentation performance.
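The abstract leaves the mechanics of context aggregation implicit. Below is a minimal, hypothetical PyTorch sketch of a single directional DAG pass over a CNN feature map (a southeast sweep, in which each cell receives context from its north and west neighbors); the full model aggregates several such directional sweeps so that every position can receive context from the whole image. All module and parameter names here are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch of one directional DAG-RNN sweep (names are
# hypothetical, not from the paper's released code).
import torch
import torch.nn as nn

class DirectionalDAGRNN(nn.Module):
    """One plain-RNN sweep over an H x W feature map in topological
    order (top-left to bottom-right), so each cell aggregates hidden
    states from its north and west predecessors in the DAG."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.input_proj = nn.Conv2d(in_dim, hid_dim, kernel_size=1)
        self.recur = nn.Linear(hid_dim, hid_dim, bias=False)
        self.act = nn.ReLU()

    def forward(self, feats):                      # feats: (B, C, H, W)
        x = self.input_proj(feats)                 # (B, D, H, W)
        B, D, H, W = x.shape
        zero = x.new_zeros(B, D)
        rows = []                                  # hidden states, row by row
        for i in range(H):
            cells = []
            for j in range(W):
                north = rows[i - 1][..., j] if i > 0 else zero
                west = cells[j - 1] if j > 0 else zero
                cells.append(self.act(x[:, :, i, j] + self.recur(north + west)))
            rows.append(torch.stack(cells, dim=-1))   # (B, D, W)
        return torch.stack(rows, dim=-2)              # (B, D, H, W)
```

The per-cell Python loop is slow in practice; it is written this way only to make the recurrence over the DAG's topological order explicit.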

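The class-weighted loss can be realized as a cross-entropy whose per-class weights grow as training-set frequency shrinks. The sketch below uses median-frequency balancing as one standard, illustrative weighting; this particular formula is an assumption, and the paper's exact scheme may differ.

```python
# Hedged sketch of a class-weighted segmentation loss; the weighting
# formula (median-frequency balancing) is an illustrative assumption.
import torch
import torch.nn.functional as F

def class_weights_from_counts(pixel_counts, eps=1e-8):
    """Map per-class pixel counts over the training set to weights
    w_c = median(freq) / freq_c, so infrequent classes receive larger
    weights during training."""
    freqs = pixel_counts.float() / (pixel_counts.sum() + eps)
    return freqs.median() / (freqs + eps)

def class_weighted_loss(logits, target, weights, ignore_index=255):
    # logits: (B, num_classes, H, W); target: (B, H, W) of class ids.
    return F.cross_entropy(logits, target, weight=weights,
                           ignore_index=ignore_index)

# Usage (hypothetical counts):
# counts = torch.tensor([9.5e8, 2.1e6, 4.0e5])   # pixels per class
# loss = class_weighted_loss(logits, target,
#                            class_weights_from_counts(counts))
```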
