Zhou Ziheng, Prügel-Bennett Adam, Damper Robert I
Affiliation: Information: Signals, Images, Systems (ISIS) Research Group, School of Electronics and Computer Science, University of Southampton, Highfield, UK.
IEEE Trans Pattern Anal Mach Intell. 2006 Nov;28(11):1738-52. doi: 10.1109/TPAMI.2006.214.
Extracting full-body motion of walking people from monocular video sequences in complex, real-world environments is an important and difficult problem, going beyond simple tracking, whose satisfactory solution demands an appropriate balance between use of prior knowledge and learning from data. We propose a consistent Bayesian framework for introducing strong prior knowledge into a system for extracting human gait. In this work, the strong prior is built from a simple articulated model having both time-invariant (static) and time-variant (dynamic) parameters. The model is easily modified to cater to situations such as walkers wearing clothing that obscures the limbs. The statistics of the parameters are learned from high-quality (indoor laboratory) data and the Bayesian framework then allows us to "bootstrap" to accurate gait extraction on the noisy images typical of cluttered, outdoor scenes. To achieve automatic fitting, we use a hidden Markov model to detect the phases of images in a walking cycle. We demonstrate our approach on silhouettes extracted from fronto-parallel ("sideways on") sequences of walkers under both high-quality indoor and noisy outdoor conditions. As well as high-quality data with synthetic noise and occlusions added, we also test walkers with rucksacks, skirts, and trench coats. Results are quantified in terms of chamfer distance and average pixel error between automatically extracted body points and corresponding hand-labeled points. No one part of the system is novel in itself, but the overall framework makes it feasible to extract gait from very much poorer quality image sequences than hitherto. This is confirmed by comparing person identification by gait using our method and a well-established baseline recognition algorithm.
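The abstract reports results quantified by chamfer distance between extracted and hand-labeled body points. As a minimal illustration of that metric (not the authors' implementation, which would typically operate on edge maps via a distance transform), the following sketch computes the symmetric chamfer distance between two 2-D point sets; all names here are hypothetical:

```python
import math

def chamfer_distance(points_a, points_b):
    """Symmetric chamfer distance between two 2-D point sets.

    For each point in one set, take the Euclidean distance to its
    nearest neighbour in the other set; average these nearest-neighbour
    distances in both directions and take the mean of the two averages.
    """
    def one_way(src, dst):
        # Average nearest-neighbour distance from src to dst.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (one_way(points_a, points_b) + one_way(points_b, points_a))

# Toy example: a "model" contour versus a silhouette shifted by one pixel.
model = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
silhouette = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(chamfer_distance(model, silhouette))  # → 0.3333...
```

The brute-force nearest-neighbour search is quadratic in the number of points; for image-sized edge sets a precomputed distance transform is the usual efficient alternative.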