Li Mu, Zuo Wangmeng, Gu Shuhang, You Jane, Zhang David
IEEE Trans Pattern Anal Mach Intell. 2021 Oct;43(10):3446-3461. doi: 10.1109/TPAMI.2020.2983926. Epub 2021 Sep 2.
Learning-based lossy image compression usually involves joint rate-distortion optimization and requires coping with the spatial variation of image content and the contextual dependence among learned codes. Traditional entropy models can spatially adapt the local bit rate to the image content, but are usually limited in exploiting context in code space. On the other hand, most deep context models are computationally expensive and cannot decode symbols efficiently in parallel. In this paper, we present a content-weighted encoder-decoder model, in which channel-wise multi-valued quantization is deployed to discretize the encoder features, and an importance map subnet is introduced to generate importance masks for spatially varying code pruning. Consequently, the summation of the importance masks serves as an upper bound on the length of the bitstream. Furthermore, the quantized representations of the learned codes and the importance map remain spatially dependent, so they can be losslessly compressed using arithmetic coding. To compress the codes effectively and efficiently, we propose an upper-triangular masked convolutional network (triuMCN) for large-context modeling. Experiments show that the proposed method produces visually much better results and performs favorably against both deep and traditional lossy image compression approaches.
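The importance-mask idea in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden reconstruction, not the paper's implementation: it assumes an importance map `p` in [0, 1] is quantized to `levels` discrete values, and each quantized value selects how many leading channels of the code are kept at that spatial position (with `n_channels` divisible by `levels`). The sum of the resulting binary mask then upper-bounds the number of symbols written to the bitstream.

```python
import numpy as np

def importance_mask(p, n_channels, levels):
    """Build a per-position channel mask from an importance map.

    p          : (H, W) importance map with values in [0, 1]
    n_channels : number of code channels C (assumed divisible by `levels`)
    levels     : number of quantization levels for the importance map
    Returns a (C, H, W) binary mask; mask.sum() upper-bounds the bitstream length.
    """
    q = np.ceil(p * levels).astype(int)        # quantized importance, in {0, ..., levels}
    group = n_channels // levels               # channels assigned per importance level
    keep = q * group                           # channels kept at each (h, w)
    c = np.arange(n_channels)[:, None, None]   # channel index grid, shape (C, 1, 1)
    return (c < keep[None]).astype(np.float32)

# Toy example: 8 code channels, 4 importance levels, a 2x2 importance map.
p = np.array([[0.0, 1.0],
              [0.5, 0.25]])
mask = importance_mask(p, n_channels=8, levels=4)
rate_upper_bound = mask.sum()  # kept symbols: 0 + 8 + 4 + 2 = 14
```

A position with importance 0 keeps no channels at all, which is how the model prunes codes in smooth regions while spending more bits on complex content.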
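The context-modeling component can likewise be sketched with a generic causal masked convolution, in the spirit of PixelCNN-style context models: the filter is zeroed at and after the center position in raster-scan order, so each output depends only on symbols that are already available at decoding time. This sketch does not reproduce the triuMCN architecture itself (its upper-triangular mask over a large context); it only demonstrates the causality constraint that any such context model must satisfy.

```python
import numpy as np

def causal_masked_conv2d(x, weight):
    """Same-padded 2D convolution with a raster-order causal mask.

    x      : (H, W) input (e.g. a plane of quantized codes)
    weight : (k, k) filter with odd k; entries at/after the center
             in raster-scan order are zeroed before convolving.
    """
    k = weight.shape[0]
    c = k // 2
    mask = np.zeros((k, k))
    mask[:c, :] = 1.0      # all rows strictly above the center
    mask[c, :c] = 1.0      # same row, strictly left of the center
    w = weight * mask
    H, W = x.shape
    xp = np.pad(x, c)
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * w)
    return out

# Causality check: a unit impulse at (1, 1) must not influence the
# output at (1, 1) itself or at any earlier raster-order position.
x = np.zeros((3, 3))
x[1, 1] = 1.0
out = causal_masked_conv2d(x, np.ones((3, 3)))
```

Because decoding proceeds in raster order, such a mask lets the arithmetic coder condition each symbol's probability on previously decoded neighbors without ever peeking ahead.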