Zhao Xin, Chen Jianle, Karczewicz Marta, Said Amir, Seregin Vadim
IEEE Trans Image Process. 2018 Feb 5. doi: 10.1109/TIP.2018.2802202.
Over the past few decades, the separable Discrete Cosine Transform (DCT), particularly the DCT type II, has been widely used in image and video compression. It is well known that, under first-order stationary Markov conditions, the DCT is an efficient approximation of the optimal Karhunen-Loève transform. For natural image and video sources, however, a single separable transform with a fixed core adapts poorly to highly dynamic image statistics, e.g., textures and arbitrarily directed edges. It is also known that non-separable transforms can achieve better compression efficiency for images with directional texture patterns, yet they are computationally complex, especially when the transform size is large. To achieve higher transform coding gains with relatively low-complexity implementations, we propose a joint separable and non-separable transform. The proposed separable primary transform, named Enhanced Multiple Transform (EMT), applies multiple transform cores from a pre-defined subset of sinusoidal transforms, and the transform selection is signaled jointly at the block level. Moreover, a Non-Separable Secondary Transform (NSST) method is proposed to operate in conjunction with EMT. Unlike existing non-separable transform schemes, which require excessive amounts of memory and computation, the proposed NSST improves coding gain at much lower complexity. Extensive experimental results show that the proposed methods, implemented in a state-of-the-art video codec such as HEVC, can provide significant coding gains (average bitrate reductions of 6.9% and 4.5% for intra and random-access coding, respectively).
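The distinction the abstract draws between separable and non-separable transforms, and the reason the latter become expensive at large sizes, can be illustrated with a minimal sketch. This is not the EMT or NSST design from the paper: it only applies an orthonormal DCT-II to an N×N block in two ways, as a separable row/column transform and as a single N²×N² matrix acting on the vectorized block. All function names here are hypothetical and chosen for illustration. The Kronecker product is used only to build a non-separable matrix that reproduces the separable result; a general non-separable transform may use any N²×N² matrix, which is where both its extra adaptivity and its cost come from.

```python
import math


def dct2_matrix(n):
    """Orthonormal DCT-II matrix; row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def separable_transform(block, t):
    """Separable 2-D transform: Y = T X T^T (columns, then rows)."""
    return matmul(matmul(t, block), transpose(t))


def kron(a, b):
    """Kronecker product of two matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def non_separable_transform(block, big_t):
    """Non-separable form: one (N*N x N*N) matrix times the row-major
    vectorized block, reshaped back to N x N."""
    n = len(block)
    x = [v for row in block for v in row]
    y = [sum(big_t[r][c] * x[c] for c in range(n * n)) for r in range(n * n)]
    return [y[r * n:(r + 1) * n] for r in range(n)]


def multiplication_counts(n):
    """Multiplications for a direct (non-fast) implementation on an N x N
    block: two N x N matrix products versus one N^2 x N^2 matrix-vector
    product."""
    return 2 * n ** 3, n ** 4


if __name__ == "__main__":
    n = 4
    t = dct2_matrix(n)
    block = [[float((3 * r + 5 * c) % 7) for c in range(n)] for r in range(n)]
    y_sep = separable_transform(block, t)
    y_non = non_separable_transform(block, kron(t, t))
    max_diff = max(abs(y_sep[r][c] - y_non[r][c])
                   for r in range(n) for c in range(n))
    print("max difference:", max_diff)
    for size in (4, 8, 32):
        print(size, multiplication_counts(size))
```

Under these direct-implementation assumptions, an 8×8 block needs 1024 multiplications in separable form and 4096 in non-separable form, and the non-separable matrix holds N⁴ coefficients instead of N². The gap widens quickly with block size, which is consistent with the abstract's remark that non-separable transforms are costly when the transform size is large.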