Jeong Seunghwa, Kim Bumki, Cha Seunghoon, Seo Kwanggyoon, Chang Hayoung, Lee Jungjin, Kim Younghui, Noh Junyong
IEEE Trans Pattern Anal Mach Intell. 2024 Sep;46(9):6023-6039. doi: 10.1109/TPAMI.2024.3377372. Epub 2024 Aug 6.
We propose a real-time convolutional neural network (CNN) training and compression method for delivering high-quality live video even in a poor network environment. The server delivers a low-resolution video segment along with the corresponding CNN for super resolution (SR), after which the client applies the CNN to the segment to recover high-resolution video frames. To generate a trained CNN corresponding to a video segment in real time, our method rapidly increases training accuracy by exploiting the overfitting property of the CNN while also using curriculum-based training. In addition, assuming that the pretrained CNN has already been downloaded on the client side, we transfer only the residual values between the updated and pretrained CNN parameters. Because their distribution range is significantly narrower than that of the updated CNN parameters, these residuals can be quantized with low bits in real time while minimizing quantization loss. Quantitatively, our neural-enhanced adaptive live streaming pipeline (NEALS) achieves higher SR accuracy and a lower CNN compression loss rate within a constrained training time than the state-of-the-art CNN training and compression method. NEALS achieves 15 to 48% higher quality of experience compared to state-of-the-art neural-enhanced live streaming systems.
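The residual-transfer idea described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a symmetric uniform quantizer over flattened weight arrays, and all function names are hypothetical. The point is that the residual between updated and pretrained weights has a much narrower range than the weights themselves, so the same bit budget yields a much smaller quantization step.

```python
import numpy as np

def quantize_residual(updated, pretrained, bits=4):
    # Server side: transmit only the residual between the segment-updated
    # weights and the pretrained weights cached on the client.
    residual = updated - pretrained
    levels = 2 ** (bits - 1) - 1              # symmetric uniform quantizer
    scale = np.abs(residual).max() / levels    # step size from residual range
    q = np.clip(np.round(residual / scale), -levels, levels).astype(np.int8)
    return q, scale

def reconstruct(pretrained, q, scale):
    # Client side: add the dequantized residual back to the cached weights.
    return pretrained + q.astype(np.float32) * scale

# Toy weights: the updated CNN differs only slightly from the pretrained one,
# so the residual distribution is narrow and low-bit quantization loses little.
rng = np.random.default_rng(0)
pretrained = rng.normal(0.0, 1.0, size=1000).astype(np.float32)
updated = pretrained + rng.normal(0.0, 0.01, size=1000).astype(np.float32)

q, scale = quantize_residual(updated, pretrained, bits=4)
recovered = reconstruct(pretrained, q, scale)
err = np.abs(recovered - updated).max()        # bounded by about scale / 2
```

With 4 bits per parameter, the worst-case error is about half the quantizer step, which here is set by the residual's range rather than the full weight range.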