School of Computer Science and Engineering, Kyungpook National University, Daegu 41566, Korea.
Department of Computer Engineering, Dong-A University, Busan 49315, Korea.
Sensors (Basel). 2022 Aug 23;22(17):6328. doi: 10.3390/s22176328.
In this paper, we propose an object-cooperated decision method for efficient ternary tree (TT) partitioning that reduces the encoding complexity of versatile video coding (VVC). Most previous studies reduced VVC complexity using decision schemes based on the encoding context, without applying object detection models. We assume that high-level objects are important for deciding whether complex TT partitioning is required, because they provide hints about the characteristics of a video. We therefore apply an object detection model that discovers and extracts high-level object features (the number and ratio of objects) from the frames of a video sequence. Using the extracted features, we propose machine learning (ML)-based classifiers, one for each TT-split direction, that decide whether the TT-split process can be skipped in the vertical or horizontal direction and thereby reduce the encoding complexity of VVC. The TT-split decision of each classifier is formulated as a binary classification problem. Experimental results show that the proposed method decreases the encoding complexity of VVC more effectively than a state-of-the-art ML-based model.
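The pipeline described in the abstract (object features per frame, then one binary skip decision per TT-split direction) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define how the object ratio is computed or which classifier is used, so the sketch assumes the ratio is the summed bounding-box area divided by the frame area, and it stands in a simple logistic model for the ML classifier. The names `object_features`, `skip_tt_split`, `HORIZONTAL`, and `VERTICAL`, and all weight values, are hypothetical.

```python
import math
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels, as a detector might report them.
Box = Tuple[float, float, float, float]


def object_features(boxes: List[Box], frame_w: float, frame_h: float) -> Tuple[int, float]:
    """Return (object count, object-area ratio) for one frame.

    Assumption: the ratio is the summed box area over the frame area, capped
    at 1.0. Overlapping boxes are counted twice in this simplification.
    """
    covered = 0.0
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the frame before measuring its area.
        w = max(0.0, min(x1, frame_w) - max(x0, 0.0))
        h = max(0.0, min(y1, frame_h) - max(y0, 0.0))
        covered += w * h
    ratio = min(1.0, covered / (frame_w * frame_h))
    return len(boxes), ratio


def skip_tt_split(count: int, ratio: float,
                  weights: Tuple[float, float], bias: float,
                  threshold: float = 0.5) -> bool:
    """Binary decision: True means the TT split in this direction is skipped.

    A logistic model is used here only as a placeholder classifier.
    """
    z = weights[0] * count + weights[1] * ratio + bias
    p_skip = 1.0 / (1.0 + math.exp(-z))
    return p_skip >= threshold


# Hypothetical parameters, one classifier per TT-split direction.
HORIZONTAL = ((-0.4, -2.0), 1.0)
VERTICAL = ((-0.3, -1.5), 0.8)

if __name__ == "__main__":
    boxes = [(0, 0, 50, 50), (50, 50, 100, 100)]
    count, ratio = object_features(boxes, frame_w=100, frame_h=100)
    for name, (weights, bias) in (("horizontal", HORIZONTAL), ("vertical", VERTICAL)):
        print(name, "skip TT split:", skip_tt_split(count, ratio, weights, bias))
```

In a real encoder the weights would come from training on labeled partitioning decisions, and the boxes would come from the object detection model; both are outside the scope of this sketch.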