IEEE Trans Image Process. 2017 Jun;26(6):2972-2987. doi: 10.1109/TIP.2017.2692882. Epub 2017 Apr 12.
Dictionary learning has emerged as a promising alternative to the conventional hybrid coding framework. However, the rigid structure of sequential training and prediction degrades its performance in scalable video coding. This paper proposes a progressive dictionary learning framework with a hierarchical predictive structure for scalable video coding, particularly in the low-bitrate region. For pyramidal layers, sparse representation based on a spatio-temporal dictionary is adopted to improve the coding efficiency of the enhancement layers while guaranteeing reconstruction performance. The overcomplete dictionary is trained to adaptively capture local structures along motion trajectories and to exploit the correlations between neighboring resolution layers. Furthermore, progressive dictionary learning is developed to enable scalability in the temporal domain and to restrict error propagation in a closed-loop predictor. Under the hierarchical predictive structure, online learning is leveraged to guarantee training and prediction performance with an improved convergence rate. To remain compatible with the state-of-the-art scalable extension of H.264/AVC and the latest High Efficiency Video Coding (HEVC) standard, standardized codec cores are used to encode the base and enhancement layers. Experimental results show that the proposed method outperforms the latest scalable extension of HEVC and HEVC simulcast over extensive test sequences with various resolutions.
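The two generic building blocks the abstract refers to, sparse representation over an overcomplete dictionary and online dictionary learning, can be illustrated with a minimal sketch. This is not the paper's spatio-temporal or progressive scheme: it assumes plain orthogonal matching pursuit for the sparse code and a simplified online atom update in the style of accumulated-statistics methods, with atoms renormalized to unit length. The function names `omp` and `online_update` and the sparsity parameter `k` are hypothetical and chosen only for illustration.

```python
import numpy as np


def omp(D, x, k):
    """Greedy orthogonal matching pursuit (illustrative, not the paper's coder).

    Approximates x with at most k columns (atoms) of D, which are assumed
    to have unit norm. Returns the coefficient vector of length D.shape[1].
    """
    residual = x.copy()
    support = []
    sol = np.zeros(0)
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break  # no new atom helps; stop early
        support.append(idx)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef


def online_update(D, A, B, x, k):
    """One online learning step for a single sample x (assumed variant).

    A and B accumulate the statistics sum(a a^T) and sum(x a^T) over all
    samples seen so far; D, A and B are updated in place.
    """
    a = omp(D, x, k)
    A += np.outer(a, a)
    B += np.outer(x, a)
    for j in range(D.shape[1]):
        if A[j, j] > 1e-12:  # skip atoms that have never been used
            u = D[:, j] + (B[:, j] - D @ A[:, j]) / A[j, j]
            norm = np.linalg.norm(u)
            if norm > 0:
                D[:, j] = u / norm  # keep atoms at unit norm
    return a
```

A caller would initialize `D` with unit-norm columns, set `A` and `B` to zero matrices of shapes `(n_atoms, n_atoms)` and `(dim, n_atoms)`, and then call `online_update` once per incoming patch, so the dictionary adapts as new samples arrive rather than being retrained from scratch.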