IEEE Trans Image Process. 2021;30:3279-3292. doi: 10.1109/TIP.2021.3060255. Epub 2021 Mar 2.
Quality of experience (QoE), which serves as a direct evaluation of the viewing experience by end users, is vital to network optimization and should be monitored constantly. Unlike in existing video-on-demand streaming services, real-time interactivity is critical to the mobile live broadcasting experience for both broadcasters and their audiences. Existing QoE metrics, validated on limited video content and synthetic stall patterns, are effective on the QoE benchmarks they were trained on, but they often struggle in practical live broadcasting scenarios, where one must accurately understand the activity in a video whose QoE fluctuates and anticipate what will happen next in order to give the broadcaster real-time feedback. In this paper, we propose a temporal relational reasoning guided QoE evaluation approach for mobile live video broadcasting, namely TRR-QoE, which explicitly attends to the temporal relationships between consecutive frames to achieve a more comprehensive understanding of the distortion-aware variation. In our design, video frames are first processed by a deep neural network (DNN) to extract quality-indicative features. Then, besides explicitly integrating the features of individual frames to account for spatial distortion information, the method exploits multi-scale temporal relational information, corresponding to diverse temporal resolutions, to capture temporal-distortion-aware variation. The overall QoE prediction is derived by combining both aspects. Experiments on several benchmark databases show that TRR-QoE outperforms representative state-of-the-art metrics.
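The pipeline described above (per-frame features, a spatial branch, a multi-scale temporal relation branch, and a fusion step) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the network, the relation function, the sampling strategy, or the fusion rule, so every name below (`predict_qoe`, `relation_at_scale`, the scales `(2, 3, 4)`, the hidden size) is a hypothetical choice, the per-frame feature vectors stand in for DNN outputs, and the weights are random and untrained, so the returned score carries no perceptual meaning.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Return a random, untrained weight matrix as a list of rows (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def linear_relu(weights, x):
    """Compute relu(W x) with plain Python lists."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def relation_at_scale(features, k, weights, num_tuples, rng):
    """Average a relation embedding over sampled k-frame tuples.

    Each tuple keeps its frames in temporal order, so the embedding can
    depend on how the features change over time.
    """
    n = len(features)
    embeddings = []
    for _ in range(num_tuples):
        idx = sorted(rng.sample(range(n), k))  # k distinct frames, in order
        concat = [v for i in idx for v in features[i]]
        embeddings.append(linear_relu(weights, concat))
    return mean_vectors(embeddings)


def predict_qoe(features, scales=(2, 3, 4), hidden=8, num_tuples=16, seed=0):
    """Fuse a spatial branch and a multi-scale temporal branch into one score."""
    rng = random.Random(seed)  # fixed seed keeps the sketch deterministic
    dim = len(features[0])

    # Spatial branch: pool the features of individual frames.
    spatial = mean_vectors(features)

    # Temporal branch: sum relation embeddings over several temporal scales.
    temporal = [0.0] * hidden
    for k in scales:
        if k > len(features):
            continue  # not enough frames for this scale
        weights = make_linear(k * dim, hidden, rng)
        relation = relation_at_scale(features, k, weights, num_tuples, rng)
        temporal = [t + r for t, r in zip(temporal, relation)]

    # Fusion: a single linear head over the concatenated branches.
    fused = spatial + temporal
    head = make_linear(len(fused), 1, rng)[0]
    return sum(w * v for w, v in zip(head, fused))


# Usage: 8 frames with 4-dimensional placeholder features.
demo_rng = random.Random(42)
demo_frames = [[demo_rng.random() for _ in range(4)] for _ in range(8)]
demo_score = predict_qoe(demo_frames)
print(demo_score)
```

Sampling ordered frame tuples at several sizes is one simple way to expose relations at different temporal resolutions; in a trained system the random matrices would be replaced by learned layers and the score regressed against subjective QoE ratings.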