Microsoft Research New England, Microsoft Corp., One Memorial Drive, Cambridge, MA 02142, USA.
IEEE Trans Pattern Anal Mach Intell. 2011 May;33(5):978-94. doi: 10.1109/TPAMI.2010.147.
While image alignment has been studied in different areas of computer vision for decades, aligning images depicting different scenes remains a challenging problem. Analogous to optical flow, where an image is aligned to its temporally adjacent frame, we propose SIFT flow, a method to align an image to its nearest neighbors in a large image corpus containing a variety of scenes. The SIFT flow algorithm consists of matching densely sampled, pixelwise SIFT features between two images while preserving spatial discontinuities. The SIFT features allow robust matching across different scene/object appearances, whereas the discontinuity-preserving spatial model allows matching of objects located at different parts of the scene. Experiments show that the proposed approach robustly aligns complex scene pairs containing significant spatial differences. Based on SIFT flow, we propose an alignment-based large database framework for image analysis and synthesis, where image information is transferred from the nearest neighbors to a query image according to the dense scene correspondence. This framework is demonstrated through concrete applications such as motion field prediction from a single image, motion synthesis via object transfer, satellite image registration, and face recognition.
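The abstract describes the matching objective only in words: per-pixel SIFT descriptors should agree under the flow, while the spatial model tolerates discontinuities. The following is a minimal sketch of how such an objective could be scored, assuming a truncated L1 data term, a small-displacement penalty, and a truncated L1 smoothness term over 4-connected neighbors. The function name, the parameters `t`, `eta`, `alpha`, `d`, and the border clamping are all illustrative assumptions, not details given in the abstract. The sketch only evaluates the energy of a candidate flow field; it does not perform the optimization.

```python
def l1(a, b):
    """L1 distance between two descriptor vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def sift_flow_energy(desc1, desc2, flow, t=10.0, eta=0.25, alpha=1.0, d=1.0):
    """Score a candidate flow field (hypothetical helper, assumed parameters).

    desc1, desc2: H x W grids (nested lists) of per-pixel descriptor vectors.
    flow: H x W grid of integer (u, v) displacements, u along x and v along y.
    """
    h, w = len(desc1), len(desc1[0])
    energy = 0.0
    for y in range(h):
        for x in range(w):
            u, v = flow[y][x]
            # Assumption: targets outside the image are clamped to the border.
            ty = min(max(y + v, 0), h - 1)
            tx = min(max(x + u, 0), w - 1)
            # Data term: truncated L1 distance between matched descriptors.
            energy += min(l1(desc1[y][x], desc2[ty][tx]), t)
            # Small-displacement term: prefer short flow vectors.
            energy += eta * (abs(u) + abs(v))
            # Smoothness term over right and down neighbors; truncating at d
            # caps the cost of a jump, which is what preserves discontinuities.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    nu, nv = flow[ny][nx]
                    energy += min(alpha * abs(u - nu), d)
                    energy += min(alpha * abs(v - nv), d)
    return energy


# Toy 1 x 2 "images" whose two descriptors are swapped between the images.
desc_a = [[[0, 0], [1, 0]]]
desc_b = [[[1, 0], [0, 0]]]
zero_flow = [[(0, 0), (0, 0)]]
swap_flow = [[(1, 0), (-1, 0)]]
print(sift_flow_energy(desc_a, desc_b, zero_flow))
print(sift_flow_energy(desc_a, desc_b, swap_flow))
```

In this toy case the swapping flow matches both descriptors exactly, so it scores lower than the zero flow even though it pays the displacement and (truncated) smoothness penalties.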