Global-Guided Selective Context Network for Scene Parsing.

Publication Information

IEEE Trans Neural Netw Learn Syst. 2022 Apr;33(4):1752-1764. doi: 10.1109/TNNLS.2020.3043808. Epub 2022 Apr 4.

Abstract

Recent studies on semantic segmentation exploit contextual information to address inconsistent parsing predictions within large objects and the neglect of small objects. However, they apply multilevel contextual information equally across pixels, overlooking the fact that different pixels may demand different levels of context. Motivated by this intuition, we propose a novel global-guided selective context network (GSCNet) that adaptively selects contextual information to improve scene parsing. Specifically, we introduce two global-guided modules, the global-guided global module (GGM) and the global-guided local module (GLM), to select global context (GC) and local context (LC) for each pixel, respectively. Given an input feature map, GGM jointly employs the feature map and its globally pooled feature to learn a global contextual demand, based on which per-pixel GC is selected. GLM, in turn, adopts the low-level feature from the adjacent stage as LC and jointly models the input feature map, its globally pooled feature, and the LC to generate a local contextual demand, based on which per-pixel LC is selected. Furthermore, we combine these two modules into a selective context block (SCB) and insert such SCBs at different levels of the network to propagate contextual information in a coarse-to-fine manner. Finally, extensive experiments verify the effectiveness of the proposed model, which achieves state-of-the-art performance on four challenging scene parsing data sets: Cityscapes, ADE20K, PASCAL Context, and COCO Stuff. In particular, GSCNet-101 obtains 82.6% on the Cityscapes test set without using coarse data and 56.22% on the ADE20K test set.
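The abstract describes the two gating modules only at a high level. As a reading aid, the following PyTorch sketch shows one plausible interpretation of the per-pixel context selection it outlines; it is not the paper's implementation. The layer choices (1x1 convolutions, sigmoid gates, residual additions), the way the low-level feature is resized, and all tensor shapes are assumptions made purely for illustration.

```python
# Minimal sketch of the selective-context idea from the abstract.
# All layer configurations below are assumptions, not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GGM(nn.Module):
    """Global-guided global module: gates globally pooled context per pixel."""

    def __init__(self, channels):
        super().__init__()
        # Learns a per-pixel "global contextual demand" from the feature map
        # and its globally pooled descriptor (assumed 1x1 conv + sigmoid gate).
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, x):
        g = F.adaptive_avg_pool2d(x, 1)               # global context (GC), 1x1
        g_map = g.expand_as(x)                        # broadcast GC to every pixel
        demand = self.gate(torch.cat([x, g_map], 1))  # per-pixel demand in [0, 1]
        return x + demand * g_map                     # each pixel takes the GC it needs


class GLM(nn.Module):
    """Global-guided local module: gates low-level (local) context per pixel."""

    def __init__(self, channels, low_channels):
        super().__init__()
        self.align = nn.Conv2d(low_channels, channels, 1)  # match channel widths
        self.gate = nn.Sequential(nn.Conv2d(3 * channels, channels, 1), nn.Sigmoid())

    def forward(self, x, low):
        # Low-level feature from the adjacent stage serves as local context (LC);
        # resizing it to the input feature map's resolution is an assumption.
        lc = self.align(F.interpolate(low, size=x.shape[2:], mode="bilinear",
                                      align_corners=False))
        g_map = F.adaptive_avg_pool2d(x, 1).expand_as(x)
        demand = self.gate(torch.cat([x, g_map, lc], 1))  # local contextual demand
        return x + demand * lc                            # per-pixel LC selection


class SCB(nn.Module):
    """Selective context block: GGM followed by GLM, stacked coarse-to-fine."""

    def __init__(self, channels, low_channels):
        super().__init__()
        self.ggm = GGM(channels)
        self.glm = GLM(channels, low_channels)

    def forward(self, x, low):
        return self.glm(self.ggm(x), low)


# Example with illustrative shapes: a high-level feature map refined with
# the adjacent lower-stage feature map.
x = torch.randn(2, 256, 32, 32)
low = torch.randn(2, 128, 64, 64)
out = SCB(256, 128)(x, low)        # -> torch.Size([2, 256, 32, 32])
```

In this reading, the "demand" map acts as a per-pixel gate: pixels whose gate is close to 1 absorb more global (or local) context, while pixels close to 0 keep their original features, which matches the abstract's claim that different pixels demand different levels of context.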

