
Video Object Segmentation without Temporal Information

Authors

Maninis K-K, Caelles S, Chen Y, Pont-Tuset J, Leal-Taixe L, Cremers D, Van Gool L

Publication

IEEE Trans Pattern Anal Mach Intell. 2019 Jun;41(6):1515-1530. doi: 10.1109/TPAMI.2018.2838670. Epub 2018 May 23.

Abstract

Video Object Segmentation, and video processing in general, has been historically dominated by methods that rely on the temporal consistency and redundancy in consecutive video frames. When the temporal smoothness is suddenly broken, such as when an object is occluded, or some frames are missing in a sequence, the result of these methods can deteriorate significantly. This paper explores the orthogonal approach of processing each frame independently, i.e., disregarding the temporal information. In particular, it tackles the task of semi-supervised video object segmentation: the separation of an object from the background in a video, given its mask in the first frame. We present Semantic One-Shot Video Object Segmentation (OSVOS$^\mathrm{S}$), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one shot). We show that instance-level semantic information, when combined effectively, can dramatically improve the results of our previous method, OSVOS. We perform experiments on two recent single-object video segmentation databases, which show that OSVOS$^\mathrm{S}$ is both the fastest and most accurate method in the state of the art. Experiments on multi-object video segmentation show that OSVOS$^\mathrm{S}$ obtains competitive results.
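The one-shot idea in the abstract can be illustrated with a toy sketch: a per-pixel logistic classifier stands in for the fully-convolutional network, is fine-tuned on the annotated mask of the first frame only, and is then applied to every frame independently, with no temporal information shared between frames. All function names, the synthetic feature model, and the classifier itself are illustrative assumptions, not the paper's actual architecture or training schedule.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def finetune_on_first_frame(features, mask, lr=0.5, steps=200):
    """One-shot step: fit a per-pixel logistic classifier (a toy
    stand-in for the fully-convolutional network) to the mask of
    the first frame only, via plain gradient descent."""
    X = features.reshape(-1, features.shape[-1])  # (H*W, C)
    y = mask.reshape(-1).astype(float)            # (H*W,)
    w, b = np.zeros(X.shape[-1]), 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def segment_frame(features, w, b, thresh=0.5):
    """Segment one frame independently of all others (no temporal cues)."""
    return sigmoid(features @ w + b) > thresh

# Toy sequence: foreground pixels carry a distinctive appearance cue.
rng = np.random.default_rng(0)
H, W, C = 8, 8, 3

def make_frame(mask):
    f = rng.normal(0.0, 0.1, size=(H, W, C))
    f[mask] += 1.0  # foreground appearance offset
    return f

mask0 = np.zeros((H, W), dtype=bool); mask0[2:6, 2:6] = True
mask1 = np.zeros((H, W), dtype=bool); mask1[3:7, 1:5] = True  # object moved

w, b = finetune_on_first_frame(make_frame(mask0), mask0)
pred1 = segment_frame(make_frame(mask1), w, b)
print(np.mean(pred1 == mask1))  # per-pixel accuracy on the unseen frame
```

Because the classifier learns the object's appearance rather than its motion, the second frame is segmented correctly even though the object has moved, which is the abstract's point about robustness to broken temporal smoothness.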

