IEEE Trans Image Process. 2015 Jul;24(7):2225-38. doi: 10.1109/TIP.2015.2417498.
In this paper, we propose a machine learning-based fast coding unit (CU) depth decision method for High Efficiency Video Coding (HEVC), which optimizes the complexity allocation at the CU level under given rate-distortion (RD) cost constraints. First, we analyze the quad-tree CU depth decision process in HEVC and model it as a three-level hierarchical binary decision problem. Second, a flexible CU depth decision structure is presented, which allows the performance of each CU depth decision to be smoothly traded off between coding complexity and RD performance. Then, a three-output joint classifier consisting of multiple binary classifiers with different parameters is designed to control the risk of false prediction. Finally, a sophisticated RD-complexity model is derived to determine the optimal parameters for the joint classifier, which minimizes the complexity at each CU depth under given RD degradation constraints. Comparative experiments over various sequences show that the proposed CU depth decision algorithm reduces the computational complexity by 28.82% to 70.93%, and by 51.45% on average, when compared with the original HEVC test model. The Bjøntegaard delta peak signal-to-noise ratio and Bjøntegaard delta bit rate are -0.061 dB and 1.98% on average, respectively, which are negligible. Overall, the proposed algorithm outperforms the state-of-the-art schemes.
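The three-output joint classifier described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the features, the classifier type, or the parameter values, so `BinaryClassifier`, `joint_decision`, the single scalar feature, and the biases below are hypothetical. The sketch assumes two binary classifiers with different parameters whose agreement yields "split" or "no split", while disagreement yields an "uncertain" output that falls back to the full RD cost check.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class Decision(Enum):
    SPLIT = "split"          # skip the RD check at this depth and go one level deeper
    NO_SPLIT = "no_split"    # stop here; deeper CU sizes are not evaluated
    UNCERTAIN = "uncertain"  # fall back to the encoder's full RD cost comparison


@dataclass
class BinaryClassifier:
    """Hypothetical linear classifier: predicts True when w.x + b > 0."""
    weights: Sequence[float]
    bias: float

    def predict(self, features: Sequence[float]) -> bool:
        score = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return score > 0


def joint_decision(split_clf: BinaryClassifier,
                   no_split_clf: BinaryClassifier,
                   features: Sequence[float]) -> Decision:
    """Combine two binary classifiers into one three-output decision."""
    says_split = split_clf.predict(features)        # True means "split"
    says_no_split = no_split_clf.predict(features)  # True means "do not split"
    if says_split and not says_no_split:
        return Decision.SPLIT
    if says_no_split and not says_split:
        return Decision.NO_SPLIT
    # Neither classifier is confident, or they contradict each other.
    return Decision.UNCERTAIN


# Illustrative parameters only (not taken from the paper): one scalar feature x.
split_clf = BinaryClassifier(weights=[1.0], bias=-2.0)      # "split" when x > 2
no_split_clf = BinaryClassifier(weights=[-1.0], bias=-2.0)  # "no split" when x < -2

for x in (3.0, -3.0, 0.5):
    print(x, joint_decision(split_clf, no_split_clf, [x]).value)
```

In this toy setup, moving both biases toward zero shrinks the uncertain zone, so more CUs skip the full RD check (lower complexity) at a higher risk of false prediction; moving them away from zero does the opposite. One such decision would be made at each of the three levels of the CU depth hierarchy.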