Telecommunications Research Center, Department of Electrical Engineering, Arizona State University, Tempe, AZ 85287-7206, USA.
IEEE Trans Image Process. 2000;9(9):1472-83. doi: 10.1109/83.862622.
Most existing efforts in image and video compression have focused on minimizing not perceptual distortion but rather mathematically tractable, easily measured distortion metrics. While nonperceptual distortion measures are reasonably reliable at higher bit rates (high-quality applications), they correlate poorly with perceived quality at lower bit rates and fail to guarantee preservation of important perceptual qualities in the reconstructed images, despite a potentially good signal-to-noise ratio (SNR). This paper presents a perceptual-based image coder that discriminates between image components based on their perceptual relevance, achieving increased performance in terms of quality and bit rate. The new coder is based on a locally adaptive perceptual quantization scheme for compressing the visual data. Our strategy is to exploit human visual masking properties by deriving visual masking thresholds in a locally adaptive fashion based on a subband decomposition. The derived masking thresholds control the quantization stage by adapting the quantizer reconstruction levels to the local amount of masking present at each subband transform coefficient. Compared to existing non-locally adaptive perceptual quantization methods, the new locally adaptive algorithm exhibits superior performance and requires no additional side information: the amount of available masking is estimated from the already quantized data together with a linear prediction of the coefficient under consideration. By virtue of this local adaptation, the proposed quantization scheme removes a large amount of perceptually redundant information. Since the algorithm requires no additional side information, it yields a low-entropy representation of the image and is well suited for perceptually lossless image compression.
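The key idea, estimating the local masking level from coefficients that have already been quantized so that the decoder can reproduce the same adaptation without side information, can be illustrated with a minimal sketch. The power-law masking model, the `alpha` exponent, the causal four-neighbor window, and the function names below are assumptions for illustration, not the paper's actual model:

```python
import numpy as np

def local_masking_threshold(quantized_neighbors, base_threshold=1.0, alpha=0.6):
    """Estimate a visual masking threshold from already-quantized causal
    neighbors (illustrative power-law elevation model, not the paper's)."""
    activity = float(np.mean(np.abs(quantized_neighbors)))
    return base_threshold * max(1.0, activity) ** alpha

def quantize_subband(coeffs, base_threshold=1.0):
    """Quantize subband coefficients in raster order, adapting the step
    size to masking estimated from previously quantized coefficients.
    Because only quantized data drive the adaptation, a decoder can
    mirror the step-size schedule without any transmitted side info."""
    flat = coeffs.ravel()
    qflat = np.zeros_like(flat)
    for i, c in enumerate(flat):
        # causal window: the last few already-quantized coefficients
        neighbors = qflat[max(0, i - 4):i] if i > 0 else np.array([0.0])
        step = 2.0 * local_masking_threshold(neighbors, base_threshold)
        qflat[i] = step * np.round(c / step)  # uniform midtread quantizer
    return qflat.reshape(coeffs.shape)
```

In regions of high activity the estimated threshold rises, so the quantizer step widens and more noise is hidden under the masking; in flat regions the step stays near `2 * base_threshold`, preserving perceptually visible detail.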