IEEE Trans Image Process. 2016 Jul;25(7):3032-3043. doi: 10.1109/TIP.2016.2555705.
Existing color sampling-based alpha matting methods use the compositing equation to estimate alpha at a pixel from pairs of foreground (F) and background (B) samples, so the quality of the matte depends on which (F, B) pairs are selected. In this paper, the matting problem is reinterpreted as sparse coding of pixel features, wherein the sum of the codes gives the estimate of the alpha matte from a set of unpaired F and B samples. A non-parametric probabilistic segmentation provides a certainty measure of whether a pixel belongs to the foreground or the background, and the dictionary used in sparse coding is formed on the basis of this measure. By removing the restriction to (F, B) pairs, the method allows better alpha estimation from multiple F and B samples. The same framework is extended to video, where the requirement of temporal coherence is handled by forming the dictionary from samples drawn from multiple frames. A multi-frame graph model, in contrast to the single-image model used for image matting, is proposed and can be solved efficiently in closed form. Quantitative and qualitative evaluations on a benchmark dataset show that the proposed method outperforms the current state of the art in image and video matting.
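To make the contrast between pair-based and unpaired estimation concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard compositing equation I = alpha * F + (1 - alpha) * B, uses raw RGB triples in place of the paper's pixel features, and substitutes a simple Frank-Wolfe solver for simplex-constrained least squares in place of whatever sparse coding solver the paper uses. Reading alpha as the total code weight on the foreground atoms is one interpretation of "the sum of the codes" in the abstract. All function names are hypothetical.

```python
def composite(alpha, fg, bg):
    """Compositing equation: I = alpha * F + (1 - alpha) * B, per channel."""
    return [alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg)]


def alpha_from_pair(pixel, fg, bg):
    """Pair-based estimate: project (I - B) onto (F - B), clipped to [0, 1]."""
    diff = [f - b for f, b in zip(fg, bg)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        return 0.0  # F == B: alpha is undefined; 0.0 is an arbitrary choice
    num = sum((p - b) * d for p, b, d in zip(pixel, bg, diff))
    return min(1.0, max(0.0, num / denom))


def simplex_codes(pixel, atoms, iters=200):
    """Frank-Wolfe for min ||pixel - sum_k c_k * atom_k||^2, c >= 0, sum(c) = 1.

    Stand-in for a sparse coding solver (an assumption of this sketch).
    """
    n, dim = len(atoms), len(pixel)
    codes = [1.0 / n] * n  # start at the centre of the simplex
    for _ in range(iters):
        recon = [sum(c * a[j] for c, a in zip(codes, atoms)) for j in range(dim)]
        resid = [r - p for r, p in zip(recon, pixel)]
        # Gradient of the half squared error with respect to each code.
        grads = [sum(r * aj for r, aj in zip(resid, a)) for a in atoms]
        k = min(range(n), key=grads.__getitem__)
        # Moving the codes toward vertex k moves the reconstruction toward atom k.
        direction = [atoms[k][j] - recon[j] for j in range(dim)]
        dd = sum(d * d for d in direction)
        if dd == 0.0:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        step = -sum(r * d for r, d in zip(resid, direction)) / dd
        step = min(1.0, max(0.0, step))
        if step == 0.0:
            break
        codes = [(1.0 - step) * c for c in codes]
        codes[k] += step
    return codes


def alpha_from_codes(pixel, fg_atoms, bg_atoms):
    """Unpaired estimate: total code weight assigned to foreground atoms."""
    codes = simplex_codes(pixel, fg_atoms + bg_atoms)
    return sum(codes[: len(fg_atoms)])


if __name__ == "__main__":
    fg_atoms = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]  # hypothetical F samples
    bg_atoms = [[0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]  # hypothetical B samples
    pixel = composite(0.3, fg_atoms[0], bg_atoms[0])
    print(alpha_from_pair(pixel, fg_atoms[0], bg_atoms[0]))
    print(alpha_from_codes(pixel, fg_atoms, bg_atoms))
```

The pair-based function needs one chosen (F, B) pair per pixel, whereas the code-based function takes all F and B samples at once and lets the solver decide how much weight each receives, which is the flexibility the abstract attributes to dropping the pairing restriction.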