

Blind Video Quality Prediction by Uncovering Human Video Perceptual Representation.

Publication Information

IEEE Trans Image Process. 2024;33:4998-5013. doi: 10.1109/TIP.2024.3445738. Epub 2024 Sep 17.

Abstract

Blind video quality assessment (VQA) has become an increasingly demanding problem in automatically assessing the quality of ever-growing in-the-wild videos. Although efforts have been made to measure temporal distortions, the core factor that distinguishes VQA from image quality assessment (IQA), the lack of modeling of how the human visual system (HVS) relates to the temporal quality of videos hinders the precise mapping of predicted temporal scores to human perception. Inspired by the recent discovery of the temporal straightness law of natural videos in the HVS, this paper models the complex temporal distortions of in-the-wild videos in a simple and uniform representation by describing the geometric properties of videos in the visual perceptual domain. A novel videolet, a perceptual representation embedding of a few consecutive frames, is designed as the basic quality measurement unit: it quantifies temporal distortions by measuring the angular and linear displacements from the straightness law. By combining the predicted scores over all videolets, a perceptually temporal quality evaluator (PTQE) is formed to measure the temporal quality of the entire video. Experimental results demonstrate that the perceptual representation in the HVS is an efficient way of predicting subjective temporal quality. Moreover, when combined with spatial quality metrics, PTQE achieves top performance on popular in-the-wild video datasets. More importantly, PTQE requires no information beyond the video being assessed, making it applicable to any dataset without parameter tuning. Additionally, the generalizability of PTQE is evaluated on video frame interpolation tasks, demonstrating its potential to benefit temporal-related enhancement tasks.
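The two quantities the abstract measures against the straightness law, angular and linear displacement of a frame-embedding trajectory, can be sketched numerically. This is a minimal illustration under the assumption that each frame's perceptual representation is a plain vector; the function `trajectory_displacements` and the toy data are hypothetical and do not reproduce the paper's actual PTQE pipeline.

```python
import numpy as np

def trajectory_displacements(embeddings):
    """Measure deviation of a frame-embedding trajectory from a straight line.

    embeddings: (T, D) array, one perceptual representation per frame.
    Returns (angular, linear):
      angular[t] = angle in radians between consecutive displacement vectors
      linear[t]  = absolute change in step length between consecutive steps
    A perfectly straight, constant-speed trajectory yields zeros for both.
    """
    diffs = np.diff(embeddings, axis=0)           # (T-1, D) per-step displacements
    norms = np.linalg.norm(diffs, axis=1)         # step lengths
    # Cosine of the angle between each pair of consecutive displacement vectors
    cos = np.sum(diffs[:-1] * diffs[1:], axis=1) / (norms[:-1] * norms[1:])
    angular = np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards rounding error
    linear = np.abs(np.diff(norms))
    return angular, linear

# A straight, constant-speed trajectory has zero angular and linear displacement.
t = np.linspace(0.0, 1.0, 6)[:, None]
straight = t * np.array([[1.0, 2.0, 3.0]])
ang, lin = trajectory_displacements(straight)
```

Under the straightness-law view, curved or unevenly paced trajectories (larger `ang` and `lin`) would indicate temporal distortion, which a per-videolet score could then aggregate.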

