IEEE Trans Image Process. 2017 Dec;26(12):6074-6089. doi: 10.1109/TIP.2017.2745099. Epub 2017 Aug 25.
This paper proposes a joint machine learning and game theory (MLGT) modeling framework for inter-frame coding tree unit (CTU)-level bit allocation and rate control (RC) optimization in High Efficiency Video Coding (HEVC). First, a support vector machine-based multi-classification scheme is proposed to improve the prediction accuracy of the CTU-level rate-distortion (R-D) model; this learning-based R-D model is intended to overcome the long-standing "chicken-and-egg" dilemma in video coding. Second, a cooperative bargaining game based on a mixed R-D model is proposed for bit allocation optimization: the convexity of the mixed R-D model-based utility function is proved, and the Nash bargaining solution is obtained with the proposed iterative solution search method. The minimum utility is adjusted according to the reference coding distortion and the frame-level quantization parameter (QP) change. Finally, the intra-frame QP and the inter-frame adaptive bit ratios are adjusted so that inter frames receive more bit resources, which maintains smooth quality and bit consumption in the bargaining game optimization. Experimental results demonstrate that the proposed MLGT-based RC method achieves substantially better R-D performance, quality smoothness, bit rate accuracy, buffer control, and subjective visual quality than other state-of-the-art one-pass RC methods, and that its R-D performance is very close to the performance limit of the FixedQP method.
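As a rough illustration of the bargaining step described in the abstract, the sketch below allocates a frame's bit budget across CTUs by maximizing the Nash product of utility gains over each CTU's minimum utility. It is not the paper's mixed R-D model or its iterative search method. It assumes a hypothetical power-law distortion model D(r) = c * r^(-k) with 0 < k < 1, defines utility as inverse distortion, and solves the optimality condition by nested bisection. All function names and parameter values are illustrative.

```python
import math
from typing import List, Tuple

# Each CTU is (c, k, d). ASSUMPTION: power-law distortion D(r) = c * r**(-k)
# with 0 < k < 1, and d is the CTU's minimum utility (disagreement point).
Ctu = Tuple[float, float, float]

def utility(r: float, c: float, k: float) -> float:
    """Assumed utility: inverse distortion, U(r) = 1 / D(r) = r**k / c."""
    return r ** k / c

def min_rate(c: float, k: float, d: float) -> float:
    """Rate at which the utility equals the minimum utility d."""
    return (c * d) ** (1.0 / k)

def marginal_ratio(r: float, c: float, k: float, d: float) -> float:
    """d/dr of log(U(r) - d), i.e. U'(r) / (U(r) - d); decreasing in r."""
    gain = utility(r, c, k) - d
    if gain <= 0.0:
        return float("inf")  # at or below the disagreement point
    return (k * utility(r, c, k) / r) / gain

def rate_for_lambda(lam: float, ctu: Ctu, r_max: float) -> float:
    """Bisection for the rate where marginal_ratio equals the multiplier lam."""
    c, k, d = ctu
    lo, hi = min_rate(c, k, d), r_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if marginal_ratio(mid, c, k, d) > lam:
            lo = mid  # still gaining more than lam per bit: spend more
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nash_bargaining_rates(ctus: List[Ctu], budget: float) -> List[float]:
    """Maximize sum(log(U_i(r_i) - d_i)) subject to sum(r_i) == budget."""
    if sum(min_rate(*ctu) for ctu in ctus) >= budget:
        raise ValueError("budget cannot cover the minimum utilities")
    lam_lo, lam_hi = 1e-12, 1e12
    for _ in range(200):
        lam = math.sqrt(lam_lo * lam_hi)  # bisect on a log scale
        total = sum(rate_for_lambda(lam, ctu, budget) for ctu in ctus)
        if total > budget:
            lam_lo = lam  # over budget: raise the price per bit
        else:
            lam_hi = lam
    lam = math.sqrt(lam_lo * lam_hi)
    return [rate_for_lambda(lam, ctu, budget) for ctu in ctus]
```

Calling `nash_bargaining_rates([(1.0, 0.5, 10.0), (2.0, 0.6, 5.0), (0.5, 0.4, 20.0)], 3000.0)` returns one rate per CTU. The rates sum to the budget, and each exceeds the rate needed to reach that CTU's minimum utility. Because the logarithm of the Nash product is concave under the assumed model, equalizing the marginal ratio across CTUs identifies the maximizer.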