TL-CStrans Net: a vision robot for table tennis player action recognition driven via CS-Transformer.

Authors

Ma Libo, Tong Yan

Affiliations

Guangdong Polytechnic of Environmental Protection Engineering, Foshan, China.

Hunan Labor and Human Resources Vocational College, Changsha, China.

Publication information

Front Neurorobot. 2024 Oct 21;18:1443177. doi: 10.3389/fnbot.2024.1443177. eCollection 2024.

DOI:10.3389/fnbot.2024.1443177

PMID:39498235

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11532032/

Abstract

The application of robotics technology in sports training and competition is increasing rapidly. Traditional methods rely mainly on image or video data and neglect textual information. To address this issue, we propose TL-CStrans Net, a vision robot for table tennis player action recognition driven by a CS-Transformer. It is a multimodal approach that combines CS-Transformer, CLIP, and transfer learning to integrate visual and textual information. First, we employ the CS-Transformer model as the neural computing backbone; it processes the visual information extracted from table tennis game scenes and enables accurate stroke recognition. Second, we introduce the CLIP model, which combines computer vision and natural language processing. CLIP jointly learns representations of images and text, thereby aligning the visual and textual modalities. Finally, to reduce training and computational requirements, we use transfer learning: CS-Transformer and CLIP models pre-trained on related domains are applied to the table tennis stroke recognition task. Experimental results demonstrate the strong performance of TL-CStrans Net in table tennis stroke recognition. This work supports the application of multimodal robotics technology in sports and helps bridge the gap between neural computing, computer vision, and neuroscience.
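The alignment step described in the abstract can be illustrated with a minimal sketch of CLIP-style matching: a frame embedding is compared with the embeddings of text prompts by cosine similarity, and a temperature-scaled softmax turns the similarities into label probabilities. This is not the authors' implementation. The function names, the three stroke labels, the toy 3-dimensional vectors, and the temperature value are all assumptions made for illustration; in a real system the embeddings would come from pre-trained image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_stroke(image_embedding, text_embeddings, temperature=0.07):
    """Pick the text label whose embedding is closest to the image embedding.

    `temperature` is an illustrative value (assumption); smaller values
    make the probability distribution sharper.
    """
    labels = list(text_embeddings)
    logits = [cosine(image_embedding, text_embeddings[name]) / temperature
              for name in labels]
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], dict(zip(labels, probs))


# Hypothetical text-prompt embeddings (toy values, not real encoder outputs).
TEXT_EMBEDDINGS = {
    "forehand drive": [0.9, 0.1, 0.0],
    "backhand push": [0.1, 0.9, 0.1],
    "serve": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of one video frame (toy values).
frame_embedding = [0.8, 0.2, 0.1]

label, probs = classify_stroke(frame_embedding, TEXT_EMBEDDINGS)
print(label)
```

With these toy vectors the frame is closest to the "forehand drive" prompt. Because the label set is defined by text prompts, new stroke classes can in principle be added by writing new prompts rather than by retraining a classification head, which is one reason a shared image-text embedding space pairs naturally with transfer learning.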

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d09e/11532032/b119086b9efd/fnbot-18-1443177-g0001.jpg

Similar articles

Sports competition tactical analysis model of cross-modal transfer learning intelligent robot based on Swin Transformer and CLIP.

Front Neurorobot. 2023 Oct 30;17:1275645. doi: 10.3389/fnbot.2023.1275645. eCollection 2023.

Swimtrans Net: a multimodal robotic system for swimming action recognition driven via Swin-Transformer.

Front Neurorobot. 2024 Sep 24;18:1452019. doi: 10.3389/fnbot.2024.1452019. eCollection 2024.

RL-CWtrans Net: multimodal swimming coaching driven via robot vision.

Front Neurorobot. 2024 Aug 14;18:1439188. doi: 10.3389/fnbot.2024.1439188. eCollection 2024.

Sports-ACtrans Net: research on multimodal robotic sports action recognition driven via ST-GCN.

Front Neurorobot. 2024 Oct 11;18:1443432. doi: 10.3389/fnbot.2024.1443432. eCollection 2024.

What Does a Language-And-Vision Transformer See: The Impact of Semantic Information on Visual Representations.

Front Artif Intell. 2021 Dec 3;4:767971. doi: 10.3389/frai.2021.767971. eCollection 2021.

CAM-Vtrans: real-time sports training utilizing multi-modal robot data.

Front Neurorobot. 2024 Oct 11;18:1453571. doi: 10.3389/fnbot.2024.1453571. eCollection 2024.

A visual transformer-based smart textual extraction method for financial invoices.

Math Biosci Eng. 2023 Oct 7;20(10):18630-18649. doi: 10.3934/mbe.2023826.

SwinCross: Cross-modal Swin transformer for head-and-neck tumor segmentation in PET/CT images.

Med Phys. 2024 Mar;51(3):2096-2107. doi: 10.1002/mp.16703. Epub 2023 Sep 30.

Application of Table Tennis Ball Trajectory and Rotation-Oriented Prediction Algorithm Using Artificial Intelligence.

Front Neurorobot. 2022 May 11;16:820028. doi: 10.3389/fnbot.2022.820028. eCollection 2022.

Cited by

Analysis of baseball behavior recognition model based on Dual-GCN improved by motion weights.

Sci Rep. 2025 Jul 15;15(1):25588. doi: 10.1038/s41598-025-10681-z.

An improved graph factorization machine based on solving unbalanced game perception.

Front Neurorobot. 2024 Dec 4;18:1481297. doi: 10.3389/fnbot.2024.1481297. eCollection 2024.
