
Task-Driven Semantic Coding via Reinforcement Learning.

Publication information

IEEE Trans Image Process. 2021;30:6307-6320. doi: 10.1109/TIP.2021.3091909. Epub 2021 Jul 13.

DOI: 10.1109/TIP.2021.3091909
PMID: 34214035

Abstract

Task-driven semantic video/image coding, which focuses on preserving the semantic information of videos and images, has drawn considerable attention with the development of intelligent media applications such as license plate detection, face detection, and medical diagnosis. Deep neural network (DNN)-based codecs have been studied for this purpose because of their inherent end-to-end optimization mechanism. The traditional hybrid coding framework, however, cannot be optimized end to end, so a task-driven semantic fidelity metric cannot be integrated automatically into its rate-distortion optimization process. Implementing task-driven semantic coding with the traditional hybrid coding framework therefore remains both attractive and challenging, since that framework is expected to stay in wide industrial use for a long time. To address this challenge, we design semantic maps for different tasks to extract the pixelwise semantic fidelity of videos/images. Instead of integrating the semantic fidelity metric directly into the traditional hybrid coding framework, we realize task-driven semantic coding through semantic bit allocation based on reinforcement learning (RL). We formulate the semantic bit allocation problem as a Markov decision process (MDP) and use a single RL agent to automatically determine the quantization parameters (QPs) of different coding units (CUs) according to the task-driven semantic fidelity metric. Extensive experiments on different tasks, such as classification, detection, and segmentation, demonstrate the superior performance of our approach, which achieves an average bitrate saving of 34.39% to 52.62% over the High Efficiency Video Coding (H.265/HEVC) anchor under equivalent task-related semantic fidelity.
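The abstract does not spell out the state, action, or reward design, so the following is only a toy sketch of what "semantic bit allocation as an MDP" could look like, not the authors' method. It assumes a short sequence of CUs, each tagged with a hypothetical semantic importance level, and uses tabular Q-learning to pick one QP per CU. The weights, the Lagrange multiplier `LAMBDA`, and the rate model are illustrative assumptions. Only the relation between QP and quantization step size (doubling every 6 QP) comes from H.264/HEVC.

```python
import random

QPS = [22, 27, 32, 37]        # candidate actions: commonly used HEVC test QPs
WEIGHTS = [0.1, 1.0, 10.0]    # hypothetical semantic weights: background, context, task-critical
LAMBDA = 10.0                 # hypothetical rate/distortion trade-off factor


def qstep(qp):
    """Quantization step size; in H.264/HEVC it doubles every 6 QP."""
    return 2.0 ** ((qp - 4) / 6.0)


def reward(level, qp):
    """Negative cost: semantically weighted distortion plus rate (toy models)."""
    distortion = qstep(qp) ** 2 / 12.0   # uniform-quantizer approximation
    bits = 100.0 / qstep(qp)             # assumed toy rate model, not from the paper
    return -(WEIGHTS[level] * distortion + LAMBDA * bits)


def train(semantic_levels, episodes=2000, alpha=0.5, gamma=1.0, seed=0):
    """Tabular Q-learning; the state is simply the index of the current CU."""
    rng = random.Random(seed)
    n = len(semantic_levels)
    q = [[0.0] * len(QPS) for _ in range(n)]
    for _ in range(episodes):
        for i in range(n):
            a = rng.randrange(len(QPS))  # uniform random exploration (off-policy)
            r = reward(semantic_levels[i], QPS[a])
            future = max(q[i + 1]) if i + 1 < n else 0.0  # episode ends after last CU
            q[i][a] += alpha * (r + gamma * future - q[i][a])
    return q


def greedy_qps(q):
    """Read the learned policy: the best QP for each CU."""
    return [QPS[max(range(len(QPS)), key=lambda a: row[a])] for row in q]


# Hypothetical semantic map flattened to six CUs (0 = background, 2 = task-critical).
semantic_levels = [0, 0, 2, 2, 1, 0]
q_table = train(semantic_levels)
chosen = greedy_qps(q_table)
print(chosen)
```

With these assumed numbers the learned policy assigns QP 22 to the task-critical CUs, QP 27 to the context CU, and QP 37 to the background, which is the qualitative behavior the abstract describes: bits are spent where the task needs them. In this toy the choice for one CU does not affect later CUs, so the problem decomposes per CU; a real encoder has prediction dependencies and rate constraints that this sketch ignores.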


