Annu Int Conf IEEE Eng Med Biol Soc. 2023 Jul;2023:1-4. doi: 10.1109/EMBC40787.2023.10340044.
With recent advances in computer vision and machine learning (ML), video-based at-home exercise evaluation systems have become a popular research topic. However, their performance depends heavily on the amount of available training data. Since labeled datasets specific to exercising are rare, we propose a method that makes use of the abundance of fitness videos available online. Specifically, we exploit the fact that such videos often not only show the exercises but also provide language as an additional source of information. Using push-ups as an example, we show that by analyzing subtitle data with natural language processing (NLP), it is possible to create a labeled dataset (irrelevant, relevant correct, relevant incorrect) containing relevant information for pose analysis. In particular, we show that irrelevant clips (n = 332) have significantly different joint visibility values compared to relevant clips (n = 298). Inspecting cluster centroids also shows different poses for the different classes.
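The abstract does not specify which NLP technique assigns the three labels. As a rough illustration only, the following sketch assumes a simple keyword heuristic over timestamped subtitle segments. The cue lists `EXERCISE_TERMS` and `ERROR_CUES`, the `SubtitleSegment` type, and the `label_segment` function are hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass

# Hypothetical cue lists: the paper's actual NLP pipeline is not described
# in the abstract, so these keywords are assumptions for illustration.
EXERCISE_TERMS = ("push-up", "push up", "pushup")
ERROR_CUES = ("don't", "do not", "avoid", "mistake", "wrong", "incorrect")


@dataclass
class SubtitleSegment:
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    text: str     # subtitle text spoken during the segment


def label_segment(segment: SubtitleSegment) -> str:
    """Assign one of the three labels named in the abstract to a subtitle segment."""
    text = segment.text.lower()
    # Segments that never mention the exercise are treated as irrelevant.
    if not any(term in text for term in EXERCISE_TERMS):
        return "irrelevant"
    # Relevant segments with an error cue are assumed to describe incorrect form.
    if any(re.search(r"\b" + re.escape(cue) + r"\b", text) for cue in ERROR_CUES):
        return "relevant incorrect"
    return "relevant correct"


segments = [
    SubtitleSegment(0.0, 4.0, "Welcome back to the channel, smash that like button"),
    SubtitleSegment(4.0, 9.0, "Keep your body in a straight line during the push-up"),
    SubtitleSegment(9.0, 14.0, "A common mistake in push-ups is letting the hips sag"),
]
labels = [label_segment(s) for s in segments]
```

Under this assumption, the timestamps of each labeled segment could then be used to cut the corresponding video clip for pose estimation. A real system would likely need a more robust classifier than keyword matching.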