IEEE Trans Pattern Anal Mach Intell. 2014 Jun;36(6):1187-200. doi: 10.1109/TPAMI.2013.242.
Motion is a strong cue for unsupervised object-level grouping. In this paper, we demonstrate that motion is exploited most effectively when it is considered over larger time windows. In contrast to classical two-frame optical flow, point trajectories that span hundreds of frames are less susceptible to the short-term variations that hinder the separation of different objects. As a positive side effect, the resulting groupings are temporally consistent over a whole video shot, a property that requires tedious post-processing in the vast majority of existing approaches. We suggest a paradigm that starts with semi-dense motion cues and then fills in textureless areas based on color. This paper also contributes the Freiburg-Berkeley motion segmentation (FBMS) dataset, a large, heterogeneous benchmark with 59 sequences and pixel-accurate ground truth annotation of moving objects.
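The idea of starting from semi-dense labels and filling the remaining pixels based on color can be illustrated with a toy sketch. This is not the method of the paper, whose densification step is not described in the abstract. It is a simple nearest-seed assignment in joint position and color space, and the function name `densify_labels`, the parameter `color_weight`, and the tiny example image are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]              # (row, col)
Color = Tuple[float, float, float]   # RGB values in [0, 1]


def densify_labels(
    image: List[List[Color]],
    sparse: Dict[Pixel, int],
    color_weight: float = 10.0,      # hypothetical trade-off parameter
) -> List[List[int]]:
    """Give every pixel the label of its closest seed.

    Closeness is squared spatial distance plus a weighted squared
    color distance, so unlabeled pixels prefer seeds of similar color.
    """
    dense = []
    for r, row in enumerate(image):
        out_row = []
        for c, color in enumerate(row):
            best_label, best_cost = -1, float("inf")
            for (sr, sc), label in sparse.items():
                seed_color = image[sr][sc]
                spatial = (r - sr) ** 2 + (c - sc) ** 2
                chroma = sum((a - b) ** 2 for a, b in zip(color, seed_color))
                cost = spatial + color_weight * chroma
                if cost < best_cost:
                    best_label, best_cost = label, cost
            out_row.append(best_label)
        dense.append(out_row)
    return dense


# Toy 1x6 image: four red pixels followed by two blue pixels.
RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
image = [[RED, RED, RED, RED, BLUE, BLUE]]

# Two labeled trajectory points (seeds), one per object.
seeds = {(0, 0): 0, (0, 5): 1}

print(densify_labels(image, seeds))                    # color-aware fill
print(densify_labels(image, seeds, color_weight=0.0))  # position only
```

With the color term active, the fourth pixel follows the red seed even though the blue seed is spatially closer. Setting `color_weight=0.0` reduces the fill to plain nearest-neighbor assignment, which places the boundary in the wrong position for this example.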