Chen He, Yue Xiaoyu
Department of Physical Education, Sangmyung University, Seoul, Republic of Korea.
Nanjing University of Technology, Nanjing, Jiangsu, China.
Front Neurorobot. 2024 Sep 24;18:1452019. doi: 10.3389/fnbot.2024.1452019. eCollection 2024.
Machine learning offers a promising route to precise analysis and improvement of swimming technique. Existing methods have improved action recognition accuracy to some extent, but they still suffer from insufficient feature extraction, limited generalization ability, and poor real-time performance.
To address these issues, this paper proposes Swimtrans Net, a multimodal robotic system for swimming action recognition driven by Swin-Transformer. Swimtrans Net uses the visual feature extraction capabilities of Swin-Transformer to extract information from swimming images. To support multimodal tasks, we integrate the CLIP model into the system, with Swin-Transformer serving as CLIP's image encoder. Fine-tuning CLIP enables it to interpret swimming action data and to learn the features and patterns associated with swimming. Finally, we apply transfer learning with pre-training to reduce training time and computational cost, which supports real-time feedback to swimmers.
Experimental results show that Swimtrans Net achieves a 2.94% improvement over current state-of-the-art methods in swimming motion analysis and prediction. The proposed method can help coaches and swimmers better understand and improve swimming technique, and ultimately swimming performance.
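The abstract does not state the training objective used when fine-tuning CLIP with a Swin-Transformer image encoder. As a hedged illustration only, the sketch below shows the symmetric contrastive loss commonly associated with CLIP-style training, in which matched image and text embeddings are pulled together and mismatched pairs are pushed apart. The function names, the toy embeddings, and the fixed temperature of 0.07 are assumptions made for this example and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric image-to-text and text-to-image contrastive loss.

    image_embs[i] and text_embs[i] are assumed to describe the same sample,
    e.g. a swimming frame and its action label text (hypothetical pairing).
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image direction
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits_t))


# Toy 2-D embeddings: aligned pairs should give a much lower loss than swapped pairs.
aligned = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

In a real system the embeddings would come from the image encoder and the text encoder rather than hand-written vectors, and the temperature is typically a learned parameter rather than a constant.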