SVCNet: Scribble-Based Video Colorization Network With Temporal Aggregation.

Authors

Zhao Yuzhi, Po Lai-Man, Liu Kangcheng, Wang Xuehui, Yu Wing-Yin, Xian Pengfei, Zhang Yujia, Liu Mengyang

Publication information

IEEE Trans Image Process. 2023;32:4443-4458. doi: 10.1109/TIP.2023.3298537. Epub 2023 Aug 4.

Abstract

In this paper, we propose SVCNet, a scribble-based video colorization network with temporal aggregation. It colorizes monochrome videos according to user-given color scribbles and addresses three common issues in scribble-based video colorization: colorization vividness, temporal consistency, and color bleeding. To improve colorization quality and strengthen temporal consistency, SVCNet uses two sequential sub-networks, for precise colorization and temporal smoothing, respectively. The first stage includes a pyramid feature encoder that incorporates color scribbles with a grayscale frame, and a semantic feature encoder that extracts semantics. The second stage fine-tunes the output of the first stage by aggregating information from neighboring colorized frames (as short-range connections) and the first colorized frame (as a long-range connection). To alleviate color bleeding artifacts, we learn video colorization and segmentation simultaneously. Furthermore, we perform the majority of operations at a fixed small image resolution and use a Super-resolution Module at the tail of SVCNet to recover the original size, which allows SVCNet to handle different image resolutions at inference. Finally, we evaluate SVCNet on the DAVIS and Videvo benchmarks. The experimental results demonstrate that SVCNet produces higher-quality and more temporally consistent videos than other well-known video colorization approaches. The code and models can be found at https://github.com/zhaoyuzhi/SVCNet.
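
The short-range and long-range connections of the second stage can be illustrated with a toy sketch. This is not the SVCNet implementation: the abstract describes a learned sub-network, whereas the code below swaps in a fixed weighted average purely to show which frames feed into each refined frame. The names `temporal_aggregate` and `blend`, the weights `w_self`, `w_short`, and `w_long`, and the representation of a frame as a flat list of chroma values are all assumptions made for illustration.

```python
from typing import List

# Toy stand-in: a frame is a flat list of per-pixel chroma values.
Frame = List[float]


def blend(frames: List[Frame], weights: List[float]) -> Frame:
    """Per-pixel weighted average of several frames (weights are normalized)."""
    total = sum(weights)
    return [
        sum(w * f[i] for f, w in zip(frames, weights)) / total
        for i in range(len(frames[0]))
    ]


def temporal_aggregate(
    stage1: List[Frame],
    w_self: float = 0.6,    # hypothetical weight of the current frame
    w_short: float = 0.15,  # hypothetical weight of each neighboring frame
    w_long: float = 0.1,    # hypothetical weight of the first frame
) -> List[Frame]:
    """Refine first-stage outputs using neighbors and the first frame.

    Assumption: a fixed weighted average replaces the learned second stage.
    """
    refined = []
    for t, current in enumerate(stage1):
        refs, weights = [current], [w_self]
        # Short-range connections: previous and next colorized frames.
        for n in (t - 1, t + 1):
            if 0 <= n < len(stage1):
                refs.append(stage1[n])
                weights.append(w_short)
        # Long-range connection: the first colorized frame.
        if t > 0:
            refs.append(stage1[0])
            weights.append(w_long)
        refined.append(blend(refs, weights))
    return refined


if __name__ == "__main__":
    # One-pixel video whose middle frame flickers away from its neighbors.
    video = [[0.5], [0.9], [0.5]]
    print(temporal_aggregate(video))
```

In this sketch the flickering middle frame is pulled toward its neighbors and the first frame, which is the intuition behind temporal smoothing; a video with constant color passes through unchanged because the weights are normalized.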
