Sparse Representation With Spatio-Temporal Online Dictionary Learning for Promising Video Coding.

Publication information

IEEE Trans Image Process. 2016 Oct;25(10):4580-4595. doi: 10.1109/TIP.2016.2594490. Epub 2016 Jul 27.

Abstract

Classical dictionary learning methods for video coding suffer from high computational complexity, and their coding efficiency is impaired because they disregard the underlying distribution of the samples. This paper proposes a spatio-temporal online dictionary learning (STOL) algorithm that speeds up the convergence of dictionary learning with a guaranteed approximation error. The proposed algorithm uses stochastic gradient descent to form a dictionary of pairs of 3D low-frequency and high-frequency spatio-temporal volumes. In each iteration of the learning process, it randomly selects one sample volume and updates the dictionary atoms by minimizing the expected cost, rather than optimizing the empirical cost over the complete training data as batch learning methods such as K-SVD do. Since the selected volumes are assumed to be independent and identically distributed samples from the underlying distribution, the decomposition coefficients obtained from the trained dictionary are well suited to sparse representation. Theoretically, it is proved that the proposed STOL can achieve a better approximation for sparse representation than K-SVD while maintaining both structured sparsity and hierarchical sparsity. It is shown to outperform batch gradient descent methods (K-SVD) in convergence speed and computational complexity, and its upper bound on the prediction error is asymptotically equal to the training error. Extensive experiments validate that the STOL-based coding scheme, at lower computational complexity, improves on H.264/AVC or High Efficiency Video Coding, as well as existing super-resolution-based methods, in rate-distortion performance and visual quality.
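The one-sample-per-iteration update described in the abstract can be illustrated with a minimal sketch of generic online dictionary learning. This is not the STOL algorithm itself: it works on plain vectors rather than pairs of 3D low-frequency and high-frequency volumes, it assumes greedy matching pursuit for the sparse coding step (the abstract does not say which coder is used), and every function name, parameter, and the toy data generator are hypothetical choices made for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else list(v)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sparse_code(x, atoms, sparsity):
    """Greedy matching pursuit (an assumed coder): returns (coefficients, residual)."""
    coeffs = [0.0] * len(atoms)
    residual = list(x)
    for _ in range(sparsity):
        scores = [dot(residual, d) for d in atoms]
        k = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        coeffs[k] += scores[k]
        residual = [r - scores[k] * c for r, c in zip(residual, atoms[k])]
    return coeffs, residual


def mean_error(samples, atoms, sparsity):
    """Average squared residual norm over a set of samples."""
    total = 0.0
    for x in samples:
        _, residual = sparse_code(x, atoms, sparsity)
        total += dot(residual, residual)
    return total / len(samples)


def online_dictionary_learning(samples, n_atoms, sparsity, n_iter, lr, seed=0):
    """Stochastic gradient descent on the dictionary, one random sample per step."""
    rng = random.Random(seed)
    dim = len(samples[0])
    atoms = [normalize([rng.gauss(0, 1) for _ in range(dim)])
             for _ in range(n_atoms)]
    for _ in range(n_iter):
        x = rng.choice(samples)  # draw a single sample, not the whole batch
        coeffs, residual = sparse_code(x, atoms, sparsity)
        for k, a in enumerate(coeffs):
            if a != 0.0:
                # Gradient of 0.5 * ||x - D a||^2 w.r.t. atom k is -a * residual,
                # so the descent step adds lr * a * residual to the atom.
                stepped = [c + lr * a * r for c, r in zip(atoms[k], residual)]
                atoms[k] = normalize(stepped)  # keep atoms on the unit sphere
    return atoms


def make_toy_samples(n_samples, dim, n_true_atoms, seed=0):
    """Hypothetical toy data: each sample is a scaled copy of one hidden atom."""
    rng = random.Random(seed)
    truth = [normalize([rng.gauss(0, 1) for _ in range(dim)])
             for _ in range(n_true_atoms)]
    samples = []
    for _ in range(n_samples):
        atom = rng.choice(truth)
        scale = rng.choice([-1, 1]) * rng.uniform(0.5, 1.5)
        samples.append([scale * c for c in atom])
    return samples
```

Calling `online_dictionary_learning(samples, 6, 1, 2000, 0.2)` on data from `make_toy_samples(200, 8, 3)` trains six atoms with one nonzero coefficient per sample, and `mean_error` can then be compared against the same call with `n_iter=0`, which returns the untrained random dictionary for that seed. A batch method such as K-SVD would instead revisit the complete training set in every iteration, which is the cost the abstract says the online update avoids.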
