

Objective Video Quality Assessment Combining Transfer Learning With CNN.

Authors

Zhang Yu, Gao Xinbo, He Lihuo, Lu Wen, He Ran

Publication

IEEE Trans Neural Netw Learn Syst. 2020 Aug;31(8):2716-2730. doi: 10.1109/TNNLS.2018.2890310. Epub 2019 Feb 6.

Abstract

Nowadays, video quality assessment (VQA) is essential to video compression technology applied to video transmission and storage. However, small-scale video quality databases with imbalanced samples and low-level feature representations for distorted videos impede the development of VQA methods. In this paper, we propose a full-reference (FR) VQA metric integrating transfer learning with a convolutional neural network (CNN). First, we imitate the feature-based transfer learning framework to transfer the distorted images as the related domain, which enriches the distorted samples. Second, to extract high-level spatiotemporal features of the distorted videos, a six-layer CNN with acknowledged learning ability is pretrained and fine-tuned by the common features of the distorted image blocks (IBs) and video blocks (VBs), respectively. Notably, the labels of the distorted IBs and VBs are predicted by the classic FR metrics. Finally, based on saliency maps and the entropy function, we conduct a pooling stage to obtain the quality scores of the distorted videos by weighting the block-level scores predicted by the trained CNN. In particular, we introduce preprocessing and postprocessing steps to reduce the impact of inaccurate labels predicted by the FR-VQA metric. Owing to the feature learning in the proposed framework, two kinds of experimental schemes are carried out: train-test iterative procedures on one database, and tests on one database with training on other databases. The experimental results demonstrate that the proposed method has high expansibility and is on a par with some state-of-the-art VQA metrics on two widely used VQA databases with various compression distortions.
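The pooling stage described above weights the CNN's block-level scores using saliency maps and an entropy function. A minimal sketch of such a weighted pooling is shown below; the histogram-entropy formulation and the multiplicative combination of saliency and entropy weights are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def block_entropy(block, bins=32):
    # Shannon entropy of a block's intensity histogram (assumed weighting signal);
    # block values are expected in [0, 1]
    hist, _ = np.histogram(block, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def pooled_score(block_scores, saliency_weights, block_entropies):
    # Combine per-block saliency and entropy into normalized weights,
    # then take the weighted average of the block-level quality scores
    w = np.asarray(saliency_weights, dtype=float) * np.asarray(block_entropies, dtype=float)
    w = w / w.sum()
    return float((w * np.asarray(block_scores, dtype=float)).sum())

# Example: two blocks with equal saliency but different entropy;
# the higher-entropy block dominates the pooled score
score = pooled_score([1.0, 0.0], [1.0, 1.0], [1.0, 3.0])  # → 0.25
```

A flat (constant) block carries zero entropy and thus contributes nothing under this weighting, which matches the intuition that textureless regions reveal little about perceived distortion.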

