Piriou Gwenaëlle, Bouthemy Patrick, Yao Jian-Feng
IRISA/INRIA, 35042, Rennes, France.
IEEE Trans Image Process. 2006 Nov;15(11):3417-30. doi: 10.1109/tip.2006.881963.
Exploiting video data requires methods that can extract high-level information from images; video summarization, video retrieval, and video surveillance are typical applications. In this paper, we tackle the challenging problem of recognizing dynamic video content from low-level motion features. We adopt a statistical approach involving modeling, (supervised) learning, and classification. Because video content is diverse, even within a given class of events, we have to design appropriate models of visual motion and learn them from videos. We define original, parsimonious, global probabilistic motion models for both the dominant image motion (assumed to be due to camera motion) and the residual image motion (related to scene motion). Motion measurements include affine motion models to capture the camera motion and low-level local motion features to account for scene motion. Motion learning and recognition are solved using maximum likelihood criteria. To validate the proposed motion modeling and recognition framework, we report dynamic content recognition results on sports videos.
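As a rough illustration of two ingredients named in the abstract, the sketch below fits a 6-parameter affine motion model to a set of flow vectors and then picks an event class by maximum likelihood. It is not the authors' method: the abstract does not specify the probabilistic models, so a diagonal Gaussian over the affine parameters is used here purely as a stand-in, and plain least squares replaces whatever estimator the paper uses. All function names (`affine_flow`, `fit_affine_motion`, `learn_class_model`, `classify`) are hypothetical.

```python
import numpy as np


def affine_flow(theta, points):
    """Evaluate the affine motion model at (x, y) points.

    u = a1 + a2*x + a3*y,  v = a4 + a5*x + a6*y
    """
    a1, a2, a3, a4, a5, a6 = theta
    x, y = points[:, 0], points[:, 1]
    return np.column_stack([a1 + a2 * x + a3 * y, a4 + a5 * x + a6 * y])


def fit_affine_motion(points, flow):
    """Least-squares estimate of [a1..a6] from flow vectors.

    Assumption: ordinary least squares, with no robustness to outliers.
    points: (N, 2) array of (x, y); flow: (N, 2) array of (u, v).
    """
    x, y = points[:, 0], points[:, 1]
    design = np.column_stack([np.ones_like(x), x, y])      # (N, 3)
    coef, *_ = np.linalg.lstsq(design, flow, rcond=None)   # (3, 2)
    return coef.T.ravel()                                  # [a1, a2, a3, a4, a5, a6]


def learn_class_model(features):
    """ML estimate of a diagonal Gaussian (stand-in model, an assumption)."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    var = features.var(axis=0) + 1e-6  # small floor avoids division by zero
    return mean, var


def log_likelihood(feature, model):
    """Log-density of one feature vector under a diagonal Gaussian."""
    mean, var = model
    return float(-0.5 * np.sum(np.log(2 * np.pi * var) + (feature - mean) ** 2 / var))


def classify(feature, models):
    """Return the label whose learned model maximizes the log-likelihood."""
    return max(models, key=lambda label: log_likelihood(feature, models[label]))
```

In use, one would estimate an affine parameter vector per training clip, call `learn_class_model` once per event class, and pass a dictionary of label-to-model pairs to `classify`. A Gaussian over affine parameters ignores the residual (scene) motion that the abstract treats separately, so this covers only the camera-motion half of the framework.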