Suppr 超能文献


Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation.

Authors

Fu Haisheng, Liang Feng, Liang Jie, Wang Yongqiang, Fang Zhenman, Zhang Guohe, Han Jingning

Publication

IEEE Trans Image Process. 2024;33:4702-4715. doi: 10.1109/TIP.2024.3445737. Epub 2024 Aug 30.

DOI: 10.1109/TIP.2024.3445737
PMID: 39186412
Abstract

Deep learning-based image compression has made great progress recently. However, some leading schemes use a serial context-adaptive entropy model to improve the rate-distortion (R-D) performance, which is very slow. In addition, the complexities of the encoding and decoding networks are quite high and not suitable for many practical applications. In this paper, we propose four techniques to balance the trade-off between the complexity and performance. We first introduce the deformable residual module to remove more redundancies in the input image, thereby enhancing compression performance. Second, we design an improved checkerboard context model with two separate distribution parameter estimation networks and different probability models, which enables parallel decoding without sacrificing the performance compared to the sequential context-adaptive model. Third, we develop a three-pass knowledge distillation scheme to retrain the decoder and entropy coding, and reduce the complexity of the core decoder network, which transfers both the final and intermediate results of the teacher network to the student network to improve its performance. Fourth, we introduce L1 regularization to make the numerical values of the latent representation more sparse, and we only encode non-zero channels in the encoding and decoding process to reduce the bit rate. This also reduces the encoding and decoding time. Experiments show that compared to the state-of-the-art learned image coding scheme, our method can be about 20 times faster in encoding and 70-90 times faster in decoding, and our R-D performance is also 2.3% higher. Our method achieves better rate-distortion performance than classical image codecs including H.266/VVC-intra (4:4:4) and some recent learned methods, as measured by both PSNR and MS-SSIM metrics on the Kodak and Tecnick-40 datasets.
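The checkerboard context model the abstract refers to replaces position-by-position sequential decoding with two parallel passes: "anchor" latents are decoded first from hyperprior-only parameters, then the interleaved "non-anchor" latents are decoded conditioned on the anchors. A minimal numpy sketch of that anchor/non-anchor split (names and shapes are illustrative, not the paper's implementation):

```python
import numpy as np

def checkerboard_masks(h, w):
    """Split an h x w latent grid into anchor and non-anchor positions.

    Anchors (where i + j is even) are decoded first, all in parallel;
    non-anchors are decoded in a second parallel pass, conditioned on
    the already-decoded anchors. Two passes replace h*w serial steps.
    """
    idx = np.add.outer(np.arange(h), np.arange(w))  # idx[i, j] = i + j
    anchor = (idx % 2 == 0)
    return anchor, ~anchor

anchor, non_anchor = checkerboard_masks(4, 4)
print(anchor.sum(), non_anchor.sum())  # 8 8
```

The paper's improvement over the basic checkerboard scheme, per the abstract, is to give the two passes separate distribution-parameter estimation networks and different probability models, rather than sharing one network across both.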

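The fourth technique, L1-regularized sparsity, pays off at coding time: once many latent channels collapse to all zeros, the codec only needs to signal a channel mask and entropy-code the surviving channels. A hypothetical sketch of that selection step (the tensor and the zeroed channels are made up for illustration):

```python
import numpy as np

# Stand-in latent tensor (channels x h x w). After L1-regularized training,
# several channels collapse to all zeros and can be skipped entirely.
latent = np.ones((8, 4, 4))
latent[[1, 3, 6]] = 0.0  # channels the L1 penalty drove to zero

# Only the non-zero channels are entropy-coded; the bitstream carries a
# small per-channel mask plus the surviving channels' symbols.
nonzero = [c for c in range(latent.shape[0]) if np.any(latent[c] != 0)]
print(nonzero)  # [0, 2, 4, 5, 7]
```

Skipping zero channels cuts both the bit rate and the encode/decode time, which is how this one regularizer contributes to both of the abstract's headline numbers.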

Similar Articles

1
Fast and High-Performance Learned Image Compression With Improved Checkerboard Context Model, Deformable Residual Module, and Knowledge Distillation.
IEEE Trans Image Process. 2024;33:4702-4715. doi: 10.1109/TIP.2024.3445737. Epub 2024 Aug 30.
2
Learned Image Compression With Gaussian-Laplacian-Logistic Mixture Model and Concatenated Residual Modules.
IEEE Trans Image Process. 2023;32:2063-2076. doi: 10.1109/TIP.2023.3263099.
3
Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression.
IEEE Trans Image Process. 2020 Apr 14. doi: 10.1109/TIP.2020.2985225.
4
Syntax-Guided Content-Adaptive Transform for Image Compression.
Sensors (Basel). 2024 Aug 22;24(16):5439. doi: 10.3390/s24165439.
5
Image Compression Based on Hybrid Domain Attention and Postprocessing Enhancement.
Comput Intell Neurosci. 2022 Mar 17;2022:4926124. doi: 10.1155/2022/4926124. eCollection 2022.
6
Learning Context-Based Nonlocal Entropy Modeling for Image Compression.
IEEE Trans Neural Netw Learn Syst. 2023 Mar;34(3):1132-1145. doi: 10.1109/TNNLS.2021.3104974. Epub 2023 Feb 28.
7
Low Computational Coding-Efficient Distributed Video Coding: Adding a Decision Mode to Limit Channel Coding Load.
Entropy (Basel). 2023 Jan 28;25(2):241. doi: 10.3390/e25020241.
8
End-to-End Learnt Image Compression via Non-Local Attention Optimization and Improved Context Modeling.
IEEE Trans Image Process. 2021;30:3179-3191. doi: 10.1109/TIP.2021.3058615. Epub 2021 Feb 25.
9
APT-Net: Adaptive encoding and parallel decoding transformer for medical image segmentation.
Comput Biol Med. 2022 Dec;151(Pt A):106292. doi: 10.1016/j.compbiomed.2022.106292. Epub 2022 Nov 11.
10
Ms RED: A novel multi-scale residual encoding and decoding network for skin lesion segmentation.
Med Image Anal. 2022 Jan;75:102293. doi: 10.1016/j.media.2021.102293. Epub 2021 Nov 3.

Cited By

1
Transformer Fault Diagnosis Based on Knowledge Distillation and Residual Convolutional Neural Networks.
Entropy (Basel). 2025 Jun 23;27(7):669. doi: 10.3390/e27070669.