ViRVIG, Universitat Politècnica de Catalunya-BarcelonaTech, Pau Gargallo 14, CS Dept, Edifici U, 08028 Barcelona, Spain.
ViRVIG, Universitat Politècnica de Catalunya-BarcelonaTech, Jordi Girona 1-3, CS Dept, Edifici Omega, 08034 Barcelona, Spain.
Sensors (Basel). 2021 May 12;21(10):3368. doi: 10.3390/s21103368.
The estimation of player positions is key for performance analysis in sport. In this paper, we focus on image-based, single-angle player position estimation in padel. Unlike in tennis, the primary camera view in professional padel videos follows a de facto standard: a high-angle shot taken from about 7.6 m above the court floor. This camera angle reduces the occlusion caused by the mesh that stands over the glass walls, and offers a convenient view for judging the depth of the ball and the positions and poses of the players. We evaluate and compare the accuracy of state-of-the-art computer vision methods on a large set of images from both amateur videos and publicly available videos from the major international padel circuit. The methods we analyze include object detection, image segmentation and pose estimation techniques, all of them based on deep convolutional neural networks. We report accuracy and average precision with respect to manually annotated video frames. The best results are obtained by top-down pose estimation methods, which offer a detection rate of 99.8% and an RMSE below 5 cm and 12 cm for the horizontal and vertical court-space coordinates, respectively (deviations between predicted and ground-truth player positions). These results demonstrate the suitability of pose estimation methods based on deep convolutional neural networks for estimating player positions from single-angle padel videos. Immediate applications of this work include player and team analysis of the large collection of publicly available videos from international circuits, as well as an inexpensive way to obtain player positional data in amateur padel clubs.
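As an illustration of the two reported metrics, the following minimal sketch computes a detection rate and a per-axis RMSE between predicted and ground-truth court-space positions. It is not the authors' evaluation code: the function names, the (N, 2) array layout and the use of metres are assumptions made for this example.

```python
import numpy as np

def per_axis_rmse(pred, truth):
    """Return the RMSE along each coordinate axis.

    pred, truth: array-likes of shape (N, 2) holding court-space positions
    (assumed layout: column 0 = horizontal, column 1 = vertical, in metres).
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # Square the per-axis deviations, average over the N samples, take the root.
    return np.sqrt(np.mean((pred - truth) ** 2, axis=0))

def detection_rate(n_detected, n_annotated):
    """Fraction of annotated players for which a detection was produced."""
    return n_detected / n_annotated
```

With hypothetical inputs, `per_axis_rmse(pred, truth)` returns a two-element array (horizontal error, vertical error), which is the form in which the RMSE values above are reported.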