IEEE Trans Image Process. 2020;29:9689-9702. doi: 10.1109/TIP.2020.3028962. Epub 2020 Oct 28.
Online segmentation and recognition of skeleton-based gestures are challenging. Unlike the offline case, online inference can rely only on the most recent few frames and always completes before the whole temporal movement has been performed. However, incompletely performed gestures are ambiguous, and their early recognition easily falls into a local optimum. In this work, we address the problem with a temporal hierarchical dictionary that guides the hidden Markov model (HMM) decoding procedure. The intuition is that gestures are ambiguous, with high uncertainty, in their early phases and only become discriminative after certain phases. This uncertainty can naturally be measured by entropy. Thus, we propose a measurement called the "relative entropy map" (REM) to encode this temporal context and guide HMM decoding. Furthermore, we introduce a progressive learning strategy with which neural networks learn robust recognition of HMM states in an iterative manner. Our method is extensively evaluated on three challenging databases and achieves state-of-the-art results. It is able both to extract discriminative connotations and to reduce the large redundancy in the HMM transition process. We verify that our framework can recognize continuous gesture streams online even when the gestures are only halfway performed.
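The abstract's intuition, that uncertainty is high in the early phases of a gesture and drops once the gesture becomes discriminative, can be illustrated with plain Shannon entropy. The sketch below is not the paper's REM, whose definition the abstract does not give. It only computes the normalized entropy of a per-frame class posterior. The function names, the threshold, and the posterior values are hypothetical and chosen purely for illustration.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a discrete distribution, scaled to [0, 1].

    1.0 means maximally ambiguous (uniform); 0.0 means fully certain.
    """
    k = len(probs)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)


def first_confident_frame(posteriors, threshold=0.5):
    """Index of the first frame whose normalized entropy drops below
    `threshold`, or None if no frame does. The threshold is an assumption."""
    for t, probs in enumerate(posteriors):
        if normalized_entropy(probs) < threshold:
            return t
    return None


# Hypothetical posteriors over three gesture classes as a gesture unfolds.
posteriors = [
    [1 / 3, 1 / 3, 1 / 3],   # start of the gesture: fully ambiguous
    [0.5, 0.3, 0.2],
    [0.8, 0.1, 0.1],
    [0.98, 0.01, 0.01],      # late phase: one class dominates
]

entropies = [normalized_entropy(p) for p in posteriors]
decision_frame = first_confident_frame(posteriors, threshold=0.5)
```

With these made-up values the entropy falls monotonically from frame to frame, and the first frame below the 0.5 threshold is index 3. An online decoder could use such a signal to defer committing to a label while the observation is still ambiguous. How the paper actually folds this context into HMM decoding is not specified in the abstract.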