Kovács Lóránt, Bódis Balázs M, Benedek Csaba
HUN-REN Institute for Computer Science and Control (SZTAKI), Kende utca 13-17, H-1111 Budapest, Hungary.
Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter utca 50/A, H-1083 Budapest, Hungary.
Sensors (Basel). 2024 May 26;24(11):3427. doi: 10.3390/s24113427.
In this paper, we propose LidPose, a novel, vision-transformer-based, end-to-end pose estimation method for real-time human skeleton estimation in non-repetitive circular scanning (NRCS) lidar point clouds. Building on the ViTPose architecture, we introduce adaptations that address the unique properties of NRCS lidars, namely the sparsity of the measurements and the unusual rosette-like scanning pattern. Because of this sparsity, NRCS lidar-based perception requires a trade-off between the spatial and temporal resolution of the recorded data for efficient analysis of various phenomena. LidPose uses foreground-background segmentation of the NRCS lidar data to select a region of interest (RoI), which makes it a complete end-to-end approach to moving pedestrian detection and skeleton fitting from raw NRCS lidar measurement sequences captured by a static sensor in surveillance scenarios. To evaluate the method, we created a novel, real-world, multi-modal dataset containing camera images and lidar point clouds from a Livox Avia sensor, with annotated 2D and 3D human skeleton ground truth.
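The RoI selection step can be illustrated with a minimal sketch of range-based background subtraction for a static lidar. This is not the LidPose implementation: the abstract does not specify the segmentation technique, so the max-range background model, the angular grid, the function names (`to_cells`, `build_background`, `foreground_mask`, `roi_bounds`), and all parameter values (field of view, cell size, margin) are assumptions chosen for illustration only.

```python
import numpy as np

# Hypothetical grid parameters; placeholders, not taken from the paper.
H_FOV_DEG, V_FOV_DEG, CELL_DEG = 70.0, 77.0, 0.5


def to_cells(points):
    """Map (N, 3) points in the sensor frame to angular grid cells and ranges."""
    rng = np.linalg.norm(points, axis=1)
    az = np.degrees(np.arctan2(points[:, 1], points[:, 0]))
    el = np.degrees(np.arcsin(points[:, 2] / np.maximum(rng, 1e-9)))
    n_rows = int(np.ceil(V_FOV_DEG / CELL_DEG))
    n_cols = int(np.ceil(H_FOV_DEG / CELL_DEG))
    row = np.clip(((el + V_FOV_DEG / 2) / CELL_DEG).astype(int), 0, n_rows - 1)
    col = np.clip(((az + H_FOV_DEG / 2) / CELL_DEG).astype(int), 0, n_cols - 1)
    return row, col, rng, (n_rows, n_cols)


def build_background(frames):
    """Background model: the farthest range seen in each angular cell.

    Accumulating many frames also fills the gaps that a single sparse
    NRCS frame leaves in the grid.
    """
    bg = None
    for pts in frames:
        row, col, rng, shape = to_cells(pts)
        if bg is None:
            bg = np.zeros(shape)
        np.maximum.at(bg, (row, col), rng)
    return bg


def foreground_mask(points, bg, margin=0.3):
    """Mark points that are clearly closer than the background in their cell.

    Cells never observed during background building hold 0, so points
    falling into them are conservatively treated as background.
    """
    row, col, rng, _ = to_cells(points)
    return rng < bg[row, col] - margin


def roi_bounds(points, mask):
    """Axis-aligned 3D bounding box of the foreground points, or None."""
    fg = points[mask]
    if len(fg) == 0:
        return None
    return fg.min(axis=0), fg.max(axis=0)
```

In such a setup, the background would be built once from frames of the empty scene, and the bounding box returned by `roi_bounds` would delimit the region passed on to the pose estimator.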