Graduate Institute of Electronics Engineering, National Taiwan University, Taipei 10617, Taiwan.
IEEE Trans Image Process. 2012 May;21(5):2592-606. doi: 10.1109/TIP.2011.2177990. Epub 2011 Dec 2.
In this paper, we present a theoretical analysis of the distortion in multilayer coding structures. Specifically, we analyze the prediction structure used to achieve temporal, spatial, and quality scalability in scalable video coding (SVC) and show that the average peak signal-to-noise ratio (PSNR) of SVC is a weighted combination of the bit rates assigned to all the streams. Our analysis takes into account the end user's preference for certain resolutions. We also propose a rate-distortion (R-D) optimization algorithm and compare its performance with that of a state-of-the-art scalable bit allocation algorithm. The reported experimental results demonstrate that the proposed R-D algorithm significantly outperforms the compared approach in terms of average PSNR.
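As a rough illustration of what a preference-weighted average PSNR and a bit allocation across streams can look like, the sketch below uses a simple logarithmic rate-PSNR model per layer. This model, the function names, and all numeric values are assumptions made for illustration only; the abstract does not specify the paper's distortion model or its R-D optimization algorithm, and the sketch ignores the inter-layer prediction dependencies that the paper analyzes.

```python
import math

def average_psnr(psnr_per_layer, preference):
    """Preference-weighted mean PSNR over layers (weights are normalized)."""
    total = sum(preference)
    return sum(w * q for w, q in zip(preference, psnr_per_layer)) / total

def log_model_psnr(rate, offset, slope):
    """Hypothetical per-layer model: PSNR = offset + slope * log2(rate)."""
    return offset + slope * math.log2(rate)

def allocate_rates(total_rate, preference, slopes):
    """Maximize sum_i w_i * (a_i + b_i * log2(R_i)) subject to sum_i R_i = total_rate.

    Under the assumed log model, setting the Lagrangian derivative to zero
    gives R_i proportional to w_i * b_i.
    """
    gains = [w * b for w, b in zip(preference, slopes)]
    norm = sum(gains)
    return [total_rate * g / norm for g in gains]

# Illustrative numbers only: two layers, users prefer the second one.
preference = [0.2, 0.8]
offsets = [30.0, 30.0]
slopes = [3.0, 3.0]
total_rate = 1000.0  # kbit/s, shared by both layers

weighted = allocate_rates(total_rate, preference, slopes)
equal = [total_rate / 2, total_rate / 2]

def score(rates):
    psnrs = [log_model_psnr(r, a, b) for r, a, b in zip(rates, offsets, slopes)]
    return average_psnr(psnrs, preference)

print("weighted allocation:", weighted, "avg PSNR:", score(weighted))
print("equal allocation:   ", equal, "avg PSNR:", score(equal))
```

Under these assumptions, the layer that users prefer more receives a larger share of the rate budget, and the resulting weighted average PSNR is at least as high as with an equal split.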