
Unified DeepLabV3+ for Semi-Dark Image Semantic Segmentation.

Affiliations

High Performance Cloud Computing Center (HPC3), Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Malaysia.

Department of Computer Science, Shaheed Zulfiqar Ali Bhutto Institute of Science and Technology, Karachi 75600, Pakistan.

Publication Information

Sensors (Basel). 2022 Jul 15;22(14):5312. doi: 10.3390/s22145312.

Abstract

Semantic segmentation for accurate visual perception is a critical task in computer vision. The automatic classification of dynamic visual scenes using predefined object classes, however, remains an unresolved problem. The challenges in learning deep convolutional neural networks, specifically the ResNet-based DeepLabV3+ (the most recent version), are threefold: (1) biased centric exploitation of filter masks, (2) reduced representational power of residual networks due to identity shortcuts, and (3) a loss of spatial relationships caused by per-pixel primitives. To solve these problems, we present an efficient approach based on DeepLabV3+ along with an additional evaluation metric, namely Unified DeepLabV3+ and S3core, respectively. The presented unified version reduced the effect of biased exploitation via additional dilated convolution layers with customized dilation rates. We further tackled the problem of representational power by introducing non-linear group-normalization shortcuts, targeting the specific problem of semi-dark images. Meanwhile, to keep track of the spatial relationships in terms of the global and local contexts, geometrically bunched pixel cues were used. We combined all the proposed variants of DeepLabV3+ into Unified DeepLabV3+ for accurate visual decisions. Finally, the proposed S3core evaluation metric was based on a weighted combination of three different accuracy measures, i.e., pixel accuracy, IoU (intersection over union), and Mean BFScore, as robust identification criteria. Extensive experimental analysis on the CamVid dataset confirmed the applicability of the proposed solution to autonomous vehicles and robotics in outdoor settings. The experimental analysis showed that the proposed Unified DeepLabV3+ outperformed DeepLabV3+ by a margin of 3% in terms of class-wise pixel accuracy, along with a higher S3core, demonstrating the effectiveness of the proposed approach.
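
To make two of the ingredients described above concrete, the sketch below shows (i) a residual block whose identity shortcut is replaced by a non-linear, group-normalized projection and whose main path uses dilated 3x3 convolutions, and (ii) an S3core-style score formed as a weighted combination of pixel accuracy, mean IoU, and Mean BFScore. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: the layer arrangement, dilation rates, group counts, and equal metric weights are illustrative placeholders.

```python
import torch
import torch.nn as nn


class DilatedGNResidualBlock(nn.Module):
    """Illustrative residual block: dilated 3x3 convolutions with GroupNorm
    on the main path and a non-linear, group-normalized projection in place
    of a plain identity shortcut. Layer arrangement, dilation rate, and
    group count are assumptions, not taken from the paper."""

    def __init__(self, in_ch, out_ch, dilation=2, groups=8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3,
                      padding=dilation, dilation=dilation, bias=False),
            nn.GroupNorm(groups, out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3,
                      padding=dilation, dilation=dilation, bias=False),
            nn.GroupNorm(groups, out_ch),
        )
        # Group-normalized 1x1 projection shortcut instead of an identity mapping.
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
            nn.GroupNorm(groups, out_ch),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + self.shortcut(x))


def s3core(pixel_accuracy, mean_iou, mean_bfscore, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted combination of the three accuracy measures named in the
    abstract. The equal weights are a placeholder; the paper's actual
    weighting is not stated here."""
    w_acc, w_iou, w_bf = weights
    return w_acc * pixel_accuracy + w_iou * mean_iou + w_bf * mean_bfscore


if __name__ == "__main__":
    block = DilatedGNResidualBlock(64, 64, dilation=2)
    feats = block(torch.randn(1, 64, 45, 60))  # e.g. a CamVid feature map at stride 16
    print(feats.shape)                         # torch.Size([1, 64, 45, 60])
    print(s3core(0.91, 0.68, 0.74))            # single summary score from the three metrics
```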

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ae8/9324997/a8bb0cb348ee/sensors-22-05312-g001.jpg
