SSSIC: Semantics-to-Signal Scalable Image Coding With Learned Structural Representations.

Publication information

IEEE Trans Image Process. 2021;30:8939-8954. doi: 10.1109/TIP.2021.3121131. Epub 2021 Oct 29.

DOI: 10.1109/TIP.2021.3121131
PMID: 34699359

Abstract

We address the requirement of image coding for joint human-machine vision, i.e., the decoded image serves both human observation and machine analysis/understanding. Previously, human vision and machine vision have been extensively studied by image (signal) compression and (image) feature compression, respectively. Recently, for joint human-machine vision, several studies have been devoted to joint compression of images and features, but the correlation between images and features is still unclear. We identify the deep network as a powerful toolkit for generating structural image representations. From the perspective of information theory, the deep features of an image naturally form an entropy decreasing series: a scalable bitstream is achieved by compressing the features backward from a deeper layer to a shallower layer until culminating with the image signal. Moreover, we can obtain learned representations by training the deep network for a given semantic analysis task or multiple tasks and acquire deep features that are related to semantics. With the learned structural representations, we propose SSSIC, a framework to obtain an embedded bitstream that can be either partially decoded for semantic analysis or fully decoded for human vision. We implement an exemplar SSSIC scheme using coarse-to-fine image classification as the driven semantic analysis task. We also extend the scheme for object detection and instance segmentation tasks. The experimental results demonstrate the effectiveness of the proposed SSSIC framework and establish that the exemplar scheme achieves higher compression efficiency than separate compression of images and features.
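
The entropy argument in the abstract can be illustrated with a toy sketch. It is not the SSSIC scheme itself: the "layers" below are hypothetical deterministic quantizers standing in for learned deep features, the signal is synthetic, and an ideal entropy coder is assumed. Under those assumptions, each deeper representation has lower entropy, and coding from the deepest layer back to the signal (each layer conditioned on the one before it) costs fewer bits than coding the signal and every feature separately.

```python
import math
from collections import Counter


def empirical_entropy(symbols):
    """Shannon entropy in bits per sample of the empirical distribution."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def conditional_entropy(fine, coarse):
    """H(fine | coarse) = H(fine, coarse) - H(coarse), from empirical counts."""
    return empirical_entropy(list(zip(fine, coarse))) - empirical_entropy(coarse)


# Toy 8-bit "image signal" (synthetic data, not taken from the paper).
signal = [(37 * i + 11) % 256 for i in range(4096)]

# Hypothetical deterministic "layers": each output is a coarser function
# of the previous one, standing in for shallower and deeper features.
shallow = [x // 16 for x in signal]   # 16 possible values
deep = [s // 8 for s in shallow]      # 2 possible values

h_signal = empirical_entropy(signal)
h_shallow = empirical_entropy(shallow)
h_deep = empirical_entropy(deep)

# Embedded (scalable) stream: deepest layer first, then each shallower
# layer coded conditionally on what has already been decoded.
embedded_bits = (
    h_deep
    + conditional_entropy(shallow, deep)
    + conditional_entropy(signal, shallow)
)

# Baseline: signal and each feature coded independently of one another.
separate_bits = h_signal + h_shallow + h_deep

print("entropy per layer:", h_signal, h_shallow, h_deep)
print("embedded:", embedded_bits, "separate:", separate_bits)
```

In this toy setting a decoder that stops after the first layer already holds the `deep` representation, which corresponds to partial decoding for analysis, while reading the whole stream recovers `signal`. Real deep features are real-valued, and any practical scheme involves quantization and learned entropy models, so the exact equalities here should be read only as intuition.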


