Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan.
IEEE Trans Image Process. 2013 Jul;22(7):2600-10. doi: 10.1109/TIP.2013.2253483. Epub 2013 Mar 20.
This paper presents a saliency-based video object extraction (VOE) framework. The proposed framework aims to automatically extract foreground objects of interest without any user interaction or the use of any training data (i.e., it is not limited to any particular type of object). To separate foreground and background regions within and across video frames, the proposed method utilizes visual and motion saliency information extracted from the input video. A conditional random field is applied to effectively combine the saliency-induced features, which allows us to deal with unknown pose and scale variations of the foreground object (and its articulated parts). Because the proposed VOE framework preserves both spatial continuity and temporal consistency, experiments on a variety of videos verify that our method is able to produce quantitatively and qualitatively satisfactory VOE results.
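The abstract describes fusing visual and motion saliency cues and enforcing spatial continuity with a conditional random field. The sketch below is not the authors' actual CRF formulation; it is a hedged, simplified illustration assuming a weighted fusion of the two saliency maps as the unary term and a 4-neighbor smoothness term minimized by iterated conditional modes (ICM). The function name, weights, and maps are all hypothetical.

```python
import numpy as np

def extract_foreground(visual_sal, motion_sal, w_visual=0.5, w_motion=0.5,
                       smooth=0.3, iters=5):
    """Label each pixel foreground (1) or background (0) from saliency cues.

    A simplified stand-in for a CRF: the fused saliency acts as the unary
    term, and agreement with 4-connected neighbors acts as the pairwise
    (spatial-continuity) term, optimized with ICM sweeps.
    """
    unary = w_visual * visual_sal + w_motion * motion_sal  # fused saliency
    labels = (unary > 0.5).astype(int)                     # initial mask
    for _ in range(iters):
        # Count foreground neighbors (4-connectivity); more foreground
        # neighbors raises a pixel's score, smoothing the labeling.
        nb = np.zeros_like(unary)
        nb[1:, :] += labels[:-1, :]
        nb[:-1, :] += labels[1:, :]
        nb[:, 1:] += labels[:, :-1]
        nb[:, :-1] += labels[:, 1:]
        score = unary + smooth * (nb / 4.0 - 0.5)
        labels = (score > 0.5).astype(int)
    return labels

# Toy usage: a 4x4 region salient in both cues is kept as foreground.
visual = np.zeros((8, 8)); visual[2:6, 2:6] = 0.9
motion = np.zeros((8, 8)); motion[2:6, 2:6] = 0.8
mask = extract_foreground(visual, motion)
```

In the paper's full formulation the pairwise term is part of a learned CRF applied per frame and across frames; this toy version only conveys why combining fused saliency with a neighborhood-consistency term yields spatially coherent masks.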