Deep High-Resolution Representation Learning for Visual Recognition.

Authors

Wang Jingdong, Sun Ke, Cheng Tianheng, Jiang Borui, Deng Chaorui, Zhao Yang, Liu Dong, Mu Yadong, Tan Mingkui, Wang Xinggang, Liu Wenyu, Xiao Bin

Publication Information

IEEE Trans Pattern Anal Mach Intell. 2021 Oct;43(10):3349-3364. doi: 10.1109/TPAMI.2020.2983686. Epub 2021 Sep 2.

Abstract

High-resolution representations are essential for position-sensitive vision problems, such as human pose estimation, semantic segmentation, and object detection. Existing state-of-the-art frameworks first encode the input image as a low-resolution representation through a subnetwork formed by connecting high-to-low resolution convolutions in series (e.g., ResNet, VGGNet), and then recover the high-resolution representation from the encoded low-resolution representation. Instead, our proposed network, named the High-Resolution Network (HRNet), maintains high-resolution representations through the whole process. There are two key characteristics: (i) it connects the high-to-low resolution convolution streams in parallel, and (ii) it repeatedly exchanges information across resolutions. The benefit is that the resulting representation is semantically richer and spatially more precise. We show the superiority of the proposed HRNet in a wide range of applications, including human pose estimation, semantic segmentation, and object detection, suggesting that the HRNet is a stronger backbone for computer vision problems. All the code is available at https://github.com/HRNet.
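
For readers who want a concrete picture of the two characteristics described in the abstract, below is a minimal PyTorch sketch of the idea: two convolution streams kept at different resolutions in parallel, with a fusion step that repeatedly exchanges information between them. This is not the authors' implementation (that is available at https://github.com/HRNet); the module name, channel counts, and the choice of exchange operations (a strided convolution for high-to-low, a 1x1 convolution plus bilinear upsampling for low-to-high) are illustrative assumptions.

# Minimal sketch of the HRNet idea from the abstract: parallel high- and
# low-resolution convolution streams with repeated cross-resolution fusion.
# NOT the authors' implementation (see https://github.com/HRNet); names and
# channel counts are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStreamExchangeBlock(nn.Module):
    """One stage with a high-res and a low-res stream plus cross-resolution exchange."""

    def __init__(self, c_high=32, c_low=64):
        super().__init__()
        # Per-stream convolutions; each stream keeps its own resolution.
        self.conv_high = nn.Conv2d(c_high, c_high, 3, padding=1)
        self.conv_low = nn.Conv2d(c_low, c_low, 3, padding=1)
        # Exchange units: high -> low via strided convolution (downsampling),
        # low -> high via 1x1 convolution followed by upsampling.
        self.down = nn.Conv2d(c_high, c_low, 3, stride=2, padding=1)
        self.up = nn.Conv2d(c_low, c_high, 1)

    def forward(self, x_high, x_low):
        h = F.relu(self.conv_high(x_high))
        l = F.relu(self.conv_low(x_low))
        # Fusion: each stream receives the other stream's features.
        h_out = h + F.interpolate(self.up(l), size=h.shape[-2:],
                                  mode="bilinear", align_corners=False)
        l_out = l + self.down(h)
        return h_out, l_out

if __name__ == "__main__":
    block = TwoStreamExchangeBlock()
    x_high = torch.randn(1, 32, 64, 64)   # high-resolution stream
    x_low = torch.randn(1, 64, 32, 32)    # low-resolution stream (1/2 scale)
    y_high, y_low = block(x_high, x_low)
    print(y_high.shape, y_low.shape)      # both resolutions are preserved

Stacking several such blocks (and adding further lower-resolution streams stage by stage) reflects how the paper maintains the high-resolution representation throughout instead of recovering it from a low-resolution encoding at the end.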
