Volleyball training video classification description using the BiLSTM fusion attention mechanism

Author information

Ruiye Zhao

Affiliation

School of Civil Aviation and Transportation, Zhejiang Yuying College of Vocational Technology, Hangzhou, 310018, China.

Publication information

Heliyon. 2024 Jul 16;10(15):e34735. doi: 10.1016/j.heliyon.2024.e34735. eCollection 2024 Aug 15.

Abstract

This study explores methods for classifying and describing volleyball training videos using deep learning techniques. It develops a model that integrates Bi-directional Long Short-Term Memory (BiLSTM) and attention mechanisms, referred to as BiLSTM-Multimodal Attention Fusion Temporal Classification (BiLSTM-MAFTC), to improve the accuracy and efficiency of volleyball video content analysis. First, the model encodes features from various modalities into feature vectors, capturing different types of information such as positional and modal data. The BiLSTM network is then used to model multi-modal temporal information, while spatial and channel attention mechanisms are combined to form a dual-attention module. This module establishes correlations between features of different modalities, extracting valuable information from each modality and uncovering complementary information across modalities. Extensive experiments validate the method's effectiveness and state-of-the-art performance. Compared to conventional recurrent neural network algorithms, the model achieves recognition accuracies exceeding 95% under both Top-1 and Top-5 metrics for action recognition, with a recognition speed of 0.04 s per video. The study demonstrates that the model can effectively process and analyze multimodal temporal information, including athlete movements, positional relationships on the court, and ball trajectories, and thereby achieves precise classification and description of volleyball training videos. This advancement improves the efficiency of coaches and athletes in volleyball training and provides valuable insights for broader sports video analysis research.
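
The Top-1 and Top-5 metrics cited in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `top_k_accuracy`, the six action classes, and all scores and labels below are hypothetical values chosen only to show how the two metrics differ.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical class scores for 4 video clips over 6 made-up action classes.
scores = [
    [0.70, 0.10, 0.05, 0.05, 0.05, 0.05],  # true class 0 ranks 1st
    [0.30, 0.40, 0.10, 0.10, 0.05, 0.05],  # true class 0 ranks 2nd
    [0.05, 0.05, 0.60, 0.10, 0.10, 0.10],  # true class 2 ranks 1st
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],  # true class 5 ranks 6th
]
labels = [0, 0, 2, 5]

top1 = top_k_accuracy(scores, labels, k=1)  # 2 of 4 clips counted as correct
top5 = top_k_accuracy(scores, labels, k=5)  # 3 of 4 clips counted as correct
print(f"Top-1: {top1:.2f}, Top-5: {top5:.2f}")
```

Top-5 accuracy is never lower than Top-1 accuracy on the same predictions, because any clip whose true class ranks first is also within the top five.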
