Unveiling the Future of Human and Machine Coding: A Survey of End-to-End Learned Image Compression

Authors

Huang Chen-Hsiu, Wu Ja-Ling

Affiliation

Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106, Taiwan.

Publication information

Entropy (Basel). 2024 Apr 24;26(5):357. doi: 10.3390/e26050357.

DOI:10.3390/e26050357

PMID:38785606

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11120525/

Abstract

End-to-end learned image compression codecs have notably emerged in recent years. These codecs have demonstrated superiority over conventional methods, showcasing remarkable flexibility and adaptability across diverse data domains while supporting new distortion losses. Despite challenges such as computational complexity, learned image compression methods inherently align with learning-based data processing and analytic pipelines due to their well-suited internal representations. The concept of Video Coding for Machines has garnered significant attention from both academic researchers and industry practitioners. This concept reflects the growing need to integrate data compression with computer vision applications. In light of these developments, we present a comprehensive survey and review of lossy image compression methods. Additionally, we provide a concise overview of two prominent international standards, MPEG Video Coding for Machines and JPEG AI. These standards are designed to bridge the gap between data compression and computer vision, catering to practical industry use cases.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/092d/11120525/ecc351df44e2/entropy-26-00357-g001.jpg
