Arts et Métiers Institute of Technology, LISPEN, HESAM University, 75005 Chalon-sur-Saône, France.
IBISC Laboratory, University of Evry, 91000 Evry-Courcouronnes, France.
Sensors (Basel). 2020 Dec 7;20(23):6985. doi: 10.3390/s20236985.
This paper investigates 3D human tracking in complex environments using a particle filter applied to images captured by a catadioptric vision system. The problem has been widely studied for RGB images acquired with conventional perspective cameras, whereas omnidirectional images have seldom been used and published research in this field remains limited. In this study, Riemannian manifolds were considered in order to compute the gradient on spherical images and generate a robust descriptor, which is used with an SVM classifier for human detection. Original likelihood functions for the particle filter are proposed, using both geodesic distances and overlapping regions between the silhouette detected in the images and the projected 3D human model. The approach was evaluated experimentally on real data and compared favorably with machine-learning-based techniques in terms of 3D pose accuracy. The Root Mean Square Error (RMSE), measured by comparing estimated 3D poses with ground-truth data, gave a mean error of 0.065 m for the walking action.
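The abstract names the ingredients of the likelihood (geodesic distances and overlapping regions) and the evaluation metric (RMSE) without giving their exact form. The Python sketch below is only an illustration of how such quantities might be combined, not the authors' formulation. It assumes that points are 3D vectors projected onto the unit sphere, that silhouettes are sets of pixel coordinates, that model and silhouette points are already paired, and that the weight takes an exponential form. The function names and the parameters `alpha` and `beta` are hypothetical.

```python
import math


def geodesic_distance(p, q):
    """Great-circle distance in radians between two 3D points,
    each first normalized onto the unit sphere."""
    norm_p = math.sqrt(sum(c * c for c in p))
    norm_q = math.sqrt(sum(c * c for c in q))
    cos_angle = sum(a * b for a, b in zip(p, q)) / (norm_p * norm_q)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def overlap_ratio(region_a, region_b):
    """Intersection over union of two regions given as sets of pixels."""
    a, b = set(region_a), set(region_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def particle_weight(model_pts, silhouette_pts, model_region,
                    silhouette_region, alpha=1.0, beta=1.0):
    """Hypothetical unnormalized particle weight: larger when the projected
    model is close to the detected silhouette on the sphere and when the
    two regions overlap well. The exponential form is an assumption."""
    mean_dist = sum(
        geodesic_distance(p, q) for p, q in zip(model_pts, silhouette_pts)
    ) / len(model_pts)
    overlap = overlap_ratio(model_region, silhouette_region)
    return math.exp(-alpha * mean_dist - beta * (1.0 - overlap))


def rmse(estimated, ground_truth):
    """Root mean square of the Euclidean errors between paired 3D positions."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(e, g))
        for e, g in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

In a particle filter, a weight of this kind would be computed for each candidate 3D pose and then normalized across all particles before resampling. A particle whose projected model coincides with the detected silhouette receives the maximum weight of 1 under these assumptions.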