


DeepQTMT: A Deep Learning Approach for Fast QTMT-Based CU Partition of Intra-Mode VVC.

Author Information

Li Tianyi, Xu Mai, Tang Runzhi, Chen Ying, Xing Qunliang

Publication Information

IEEE Trans Image Process. 2021;30:5377-5390. doi: 10.1109/TIP.2021.3083447. Epub 2021 Jun 3.

DOI: 10.1109/TIP.2021.3083447
PMID: 34057892
Abstract

Versatile Video Coding (VVC), as the latest standard, significantly improves the coding efficiency over its predecessor standard High Efficiency Video Coding (HEVC), but at the expense of sharply increased complexity. In VVC, the quad-tree plus multi-type tree (QTMT) structure of the coding unit (CU) partition accounts for over 97% of the encoding time, due to the brute-force search for recursive rate-distortion (RD) optimization. Instead of the brute-force QTMT search, this paper proposes a deep learning approach to predict the QTMT-based CU partition, for drastically accelerating the encoding process of intra-mode VVC. First, we establish a large-scale database containing sufficient CU partition patterns with diverse video content, which can facilitate the data-driven VVC complexity reduction. Next, we propose a multi-stage exit CNN (MSE-CNN) model with an early-exit mechanism to determine the CU partition, in accord with the flexible QTMT structure at multiple stages. Then, we design an adaptive loss function for training the MSE-CNN model, synthesizing both the uncertain number of split modes and the target on minimized RD cost. Finally, a multi-threshold decision scheme is developed, achieving a desirable trade-off between complexity and RD performance. The experimental results demonstrate that our approach can reduce the encoding time of VVC by 44.65%-66.88% with a negligible Bjøntegaard delta bit-rate (BD-BR) of 1.322%-3.188%, significantly outperforming other state-of-the-art approaches.
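The multi-threshold decision scheme described above can be illustrated with a minimal sketch. At each partition stage, a model emits confidences over the candidate split modes; if the top confidence clears that stage's threshold, the predicted mode is committed and the RD search for the alternatives is skipped, otherwise the encoder falls back to the full RD search. The function names, thresholds, and toy confidence values below are illustrative assumptions, not the authors' MSE-CNN implementation.

```python
from typing import List, Sequence

# The six split modes of the VVC QTMT structure.
SPLIT_MODES = ["no_split", "quad", "bin_h", "bin_v", "tern_h", "tern_v"]

def decide_partition(stage_confidences: Sequence[Sequence[float]],
                     thresholds: Sequence[float]) -> List[str]:
    """Walk the partition stages; at each one, either trust the model
    (early exit, pruning the losing modes) or fall back to checking every
    mode via full RD search. Returns one decision per stage: a committed
    split mode, or the sentinel 'rd_search_all'."""
    decisions = []
    for confs, thr in zip(stage_confidences, thresholds):
        best = max(range(len(confs)), key=lambda i: confs[i])
        if confs[best] >= thr:
            decisions.append(SPLIT_MODES[best])  # confident: prune the rest
        else:
            decisions.append("rd_search_all")    # uncertain: brute-force RD
    return decisions

# Toy confidences for two stages (would come from a trained model).
confs = [
    [0.05, 0.90, 0.02, 0.01, 0.01, 0.01],  # stage 1: model is sure -> quad
    [0.30, 0.25, 0.20, 0.10, 0.10, 0.05],  # stage 2: model is unsure
]
print(decide_partition(confs, thresholds=[0.8, 0.8]))
# -> ['quad', 'rd_search_all']
```

Lowering the thresholds prunes more aggressively (faster encoding, higher BD-BR loss), while raising them defers to the RD search more often, which is the complexity/RD trade-off the abstract refers to.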


Similar Articles

1. DeepQTMT: A Deep Learning Approach for Fast QTMT-Based CU Partition of Intra-Mode VVC.
IEEE Trans Image Process. 2021;30:5377-5390. doi: 10.1109/TIP.2021.3083447. Epub 2021 Jun 3.
2. Reducing Complexity of HEVC: A Deep Learning Approach.
IEEE Trans Image Process. 2018 Jun 13. doi: 10.1109/TIP.2018.2847035.
3. Temporal Prediction Model-Based Fast Inter CU Partition for Versatile Video Coding.
Sensors (Basel). 2022 Oct 12;22(20):7741. doi: 10.3390/s22207741.
4. Decision tree accelerated CTU partition algorithm for intra prediction in versatile video coding.
PLoS One. 2021 Nov 8;16(11):e0258890. doi: 10.1371/journal.pone.0258890. eCollection 2021.
5. A Fast Algorithm for Intra-Frame Versatile Video Coding Based on Edge Features.
Sensors (Basel). 2023 Jul 7;23(13):6244. doi: 10.3390/s23136244.
6. Fast CU Partition Algorithm for Intra Frame Coding Based on Joint Texture Classification and CNN.
Sensors (Basel). 2023 Sep 15;23(18):7923. doi: 10.3390/s23187923.
7. Partition Map Prediction for Fast Block Partitioning in VVC Intra-Frame Coding.
IEEE Trans Image Process. 2023;32:2237-2251. doi: 10.1109/TIP.2023.3266165. Epub 2023 Apr 21.
8. A Fast Decision Algorithm for VVC Intra-Coding Based on Texture Feature and Machine Learning.
Comput Intell Neurosci. 2022 Sep 13;2022:7675749. doi: 10.1155/2022/7675749. eCollection 2022.
9. Low-Complexity Error Resilient HEVC Video Coding: A Deep Learning Approach.
IEEE Trans Image Process. 2021;30:1245-1260. doi: 10.1109/TIP.2020.3043124. Epub 2020 Dec 21.
10. Learned Fast HEVC Intra Coding.
IEEE Trans Image Process. 2020 Mar 30. doi: 10.1109/TIP.2020.2982832.

Cited By

1. QP-Adaptive Dual-Path Residual Integrated Frequency Transformer for Data-Driven In-Loop Filter in VVC.
Sensors (Basel). 2025 Jul 7;25(13):4234. doi: 10.3390/s25134234.
2. NRVC: Neural Representation for Video Compression with Implicit Multiscale Fusion Network.
Entropy (Basel). 2023 Aug 4;25(8):1167. doi: 10.3390/e25081167.
3. A Fast Algorithm for Intra-Frame Versatile Video Coding Based on Edge Features.
Sensors (Basel). 2023 Jul 7;23(13):6244. doi: 10.3390/s23136244.
4. A Study on Fast and Low-Complexity Algorithms for Versatile Video Coding.
Sensors (Basel). 2022 Nov 20;22(22):8990. doi: 10.3390/s22228990.
5. Temporal Prediction Model-Based Fast Inter CU Partition for Versatile Video Coding.
Sensors (Basel). 2022 Oct 12;22(20):7741. doi: 10.3390/s22207741.
6. OpenVVC Decoder Parameterized and Interfaced Synchronous Dataflow (PiSDF) Model: Tile Based Parallelism.
J Signal Process Syst. 2022 Oct 14:1-13. doi: 10.1007/s11265-022-01819-7.
7. Object-Cooperated Ternary Tree Partitioning Decision Method for Versatile Video Coding.
Sensors (Basel). 2022 Aug 23;22(17):6328. doi: 10.3390/s22176328.
8. Machine Learning for Multimedia Communications.
Sensors (Basel). 2022 Jan 21;22(3):819. doi: 10.3390/s22030819.