Barbu Adrian, Zhu Song-Chun
Department of Computer Science, University of California, Los Angeles, 8125 Math Science Bldg., Los Angeles, CA 90095, USA.
IEEE Trans Pattern Anal Mach Intell. 2005 Aug;27(8):1239-53. doi: 10.1109/TPAMI.2005.161.
Many vision tasks can be formulated as graph partition problems that minimize energy functions. For such problems, the Gibbs sampler provides a general solution but is very slow, while other methods, such as Ncut and graph cuts are computationally effective but only work for specific energy forms and are not generally applicable. In this paper, we present a new inference algorithm that generalizes the Swendsen-Wang method to arbitrary probabilities defined on graph partitions. We begin by computing graph edge weights, based on local image features. Then, the algorithm iterates two steps. 1) Graph clustering: It forms connected components by cutting the edges probabilistically based on their weights. 2) Graph relabeling: It selects one connected component and flips probabilistically, the coloring of all vertices in the component simultaneously. Thus, it realizes the split, merge, and regrouping of a "chunk" of the graph, in contrast to Gibbs sampler that flips a single vertex. We prove that this algorithm simulates ergodic and reversible Markov chain jumps in the space of graph partitions and is applicable to arbitrary posterior probabilities or energy functions defined on graphs. We demonstrate the algorithm on two typical problems in computer vision--image segmentation and stereo vision. Experimentally, we show that it is 100-400 times faster in CPU time than the classical Gibbs sampler and 20-40 times faster then the DDMCMC segmentation algorithm. For stereo, we compare performance with graph cuts and belief propagation. We also show that our algorithm can automatically infer generative models and obtain satisfactory results (better than the graphic cuts or belief propagation) in the same amount of time.
许多视觉任务都可以被表述为最小化能量函数的图划分问题。对于此类问题,吉布斯采样器提供了一种通用解决方案,但速度非常慢,而其他方法,如归一化割(Ncut)和图割,在计算上很有效,但仅适用于特定的能量形式,并不具有普遍适用性。在本文中,我们提出了一种新的推理算法,该算法将斯文森 - 王(Swendsen-Wang)方法推广到定义在图划分上的任意概率。我们首先基于局部图像特征计算图的边权重。然后,该算法迭代两个步骤。1)图聚类:它根据边的权重概率性地切断边来形成连通分量。2)图重新标记:它选择一个连通分量并概率性地同时翻转该分量中所有顶点的着色。因此,与翻转单个顶点的吉布斯采样器不同,它实现了图的一个“块”的分裂、合并和重新组合。我们证明该算法在图划分空间中模拟遍历且可逆的马尔可夫链跳跃,并且适用于定义在图上的任意后验概率或能量函数。我们在计算机视觉中的两个典型问题——图像分割和立体视觉上演示了该算法。通过实验,我们表明它在CPU时间上比经典的吉布斯采样器快100 - 400倍,比DDMCMC分割算法快20 - 40倍。对于立体视觉,我们将其性能与图割和置信传播进行比较。我们还表明,我们的算法可以自动推断生成模型,并在相同时间内获得令人满意的结果(优于图割或置信传播)。