Real-Time Multi-Person Video Synthesis with Controllable Prior-Guided Matting.

Authors

Chen Aoran, Huang Hai, Zhu Yueyan, Xue Junsheng

Affiliation

School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China.

Publication

Sensors (Basel). 2024 Apr 27;24(9):2795. doi: 10.3390/s24092795.

Abstract

To improve matting performance in multi-person dynamic scenes, we introduce a robust, real-time, high-resolution, and controllable human video matting method that achieves state-of-the-art results on all evaluated metrics. Unlike most existing methods, which treat each video frame as an independent image, we design a unified architecture that uses a controllable generation model to address the lack of overall semantic information in multi-person video. Our method, called ControlMatting, uses an independent recurrent architecture to exploit temporal information in videos and achieves significant improvements in temporal coherence and fine-detail matting quality. ControlMatting adopts a mixed training strategy that combines matting and semantic segmentation datasets, which effectively improves the model's semantic understanding. Furthermore, we propose a novel deep learning-based image filter algorithm that strengthens detail enhancement for both the matting and segmentation objectives. Our experiments show that prior information about the human body, taken from the image itself, can effectively mitigate the defective-mask problem caused by complex dynamic scenes with multiple people.
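For readers new to matting, the following is a minimal sketch of how a predicted alpha matte is typically used to synthesize a new frame. It assumes the standard compositing relation I = alpha * F + (1 - alpha) * B, which the abstract does not state explicitly. The function name `composite` and the nested-list image layout are hypothetical choices for illustration, not part of ControlMatting.

```python
def composite(alpha, foreground, background):
    """Blend a foreground over a background using a per-pixel alpha matte.

    All three arguments are 2-D lists (rows of pixels) with the same shape.
    alpha values lie in [0, 1]: 1 keeps the foreground, 0 keeps the background.
    This is the standard compositing relation, assumed here for illustration.
    """
    return [
        [a * f + (1.0 - a) * b for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha, foreground, background)
    ]


# One row of three grayscale pixels: pure background, pure foreground, soft edge.
alpha = [[0.0, 1.0, 0.5]]
foreground = [[200, 200, 200]]
background = [[100, 100, 100]]
frame = composite(alpha, foreground, background)
```

In a video setting the same blend would be applied to every frame, so errors or flicker in the matte show up directly in the synthesized output. That is why temporal coherence of the alpha matte matters.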

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d761/11086136/c1f5621cf99b/sensors-24-02795-g001.jpg
