International Research Institute MICA, HUST-CNRS/UMI-2954-GRENOBLE INP, Hanoi University of Science and Technology, Hanoi, Vietnam.
Comput Methods Programs Biomed. 2017 Jul;146:151-165. doi: 10.1016/j.cmpb.2017.05.007. Epub 2017 May 25.
Automatic detection of human falls is a key problem in video surveillance and home monitoring. Existing methods that use unimodal data (RGB, depth, or skeleton) may suffer from inadequate lighting conditions or from unreliable data. Moreover, most proposed methods are restricted to a small space and an off-line video stream.
In this study, we address these issues by combining multi-modal features (skeleton and RGB) from a Kinect sensor to exploit the strengths of each modality. If a skeleton is available, we apply a rule-based technique to the vertical velocity of the human center and its height above the floor plane. Otherwise, we compute a motion map from a continuous gray-scale image sequence, represent it with an improved kernel descriptor, and feed it to a linear Support Vector Machine. This combination speeds up the proposed system and avoids missed detections outside the measurable range of the Kinect sensor. We then deploy the method with multiple Kinects to cover large environments, using a client-server architecture with late fusion techniques.
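The skeleton branch and the late fusion step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the plane representation `(a, b, c, d)`, the thresholds `v_thresh` and `h_thresh`, and the "any sensor reports a fall" fusion rule are all assumptions chosen for illustration, since the abstract does not state the actual rules or values.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]
# Floor plane a*x + b*y + c*z + d = 0 (hypothetical representation).
Plane = Tuple[float, float, float, float]


def height_to_floor(p: Point, plane: Plane) -> float:
    """Perpendicular distance from point p to the floor plane."""
    a, b, c, d = plane
    norm = math.sqrt(a * a + b * b + c * c)
    return abs(a * p[0] + b * p[1] + c * p[2] + d) / norm


def detect_fall(
    centers: Sequence[Point],
    timestamps: Sequence[float],
    plane: Plane,
    v_thresh: float = -1.0,  # m/s, assumed value
    h_thresh: float = 0.4,   # m, assumed value
) -> bool:
    """Flag a fall when the body center drops fast and then ends up low.

    The rule (fast downward velocity followed by a low height) is an
    assumed example of a rule on vertical velocity and height to floor.
    """
    heights = [height_to_floor(p, plane) for p in centers]
    for i in range(1, len(heights)):
        dt = timestamps[i] - timestamps[i - 1]
        if dt <= 0:
            continue  # skip duplicate or out-of-order frames
        velocity = (heights[i] - heights[i - 1]) / dt
        if velocity <= v_thresh and min(heights[i:]) <= h_thresh:
            return True
    return False


def fuse_decisions(per_sensor: Sequence[bool]) -> bool:
    """Late fusion across sensors: assumed rule, any sensor reporting a fall."""
    return any(per_sensor)


floor = (0.0, 1.0, 0.0, 0.0)  # the plane y = 0
times = [0.0, 0.1, 0.2, 0.3]
falling = [(0.0, 1.0, 2.0), (0.0, 0.9, 2.0), (0.0, 0.5, 2.0), (0.0, 0.2, 2.0)]
sitting = [(0.0, 1.0, 2.0), (0.0, 0.95, 2.0), (0.0, 0.9, 2.0), (0.0, 0.85, 2.0)]

print(detect_fall(falling, times, floor))
print(detect_fall(sitting, times, floor))
print(fuse_decisions([detect_fall(sitting, times, floor),
                      detect_fall(falling, times, floor)]))
```

Because the rule needs only a few arithmetic operations per frame, a check of this kind is cheap compared with computing a descriptor and running a classifier, which is consistent with the speed-up attributed to the skeleton branch.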
We evaluated the method on several freely available fall detection datasets. Compared with recent methods, our method has a lower false alarm rate while keeping the highest accuracy. We also validated our system on-line using multiple Kinects in a large lab-based environment, where it obtained an accuracy of 91.5% at an average frame rate of 10 fps.
The proposed method using multi-modal features achieved better results than when using unimodal features. Its on-line deployment on multiple Kinects shows its potential for application in real living spaces.