Research on Volleyball Video Intelligent Description Technology Combining the Long-Term and Short-Term Memory Network and Attention Mechanism.

Affiliations

Guangzhou Sport University, Guangzhou, Guangdong 510500, China.

Guangdong Baiyun University, Guangzhou, Guangdong 510450, China.

Publication information

Comput Intell Neurosci. 2021 Oct 14;2021:7088837. doi: 10.1155/2021/7088837. eCollection 2021.

DOI:10.1155/2021/7088837

PMID:34691171

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8531798/

Abstract

With the development of computer technology, video description, which combines key techniques from natural language processing and computer vision, has attracted growing attention from researchers. Describing high-speed, detail-rich sports videos objectively and efficiently is key to the development of the field. Existing video description methods lack language learning information, which causes sentence errors and loss of visual information in the generated description text. To address these problems, this paper proposes a multihead model that combines a long short-term memory (LSTM) network with an attention mechanism for the intelligent description of volleyball videos. With the attention mechanism, the model focuses on the salient regions of the video when generating sentences. Comparative experiments with different models show that the model with the attention mechanism effectively addresses the loss of visual information. Compared with the LSTM and base models, the proposed multihead model achieves higher scores on all evaluation metrics and significantly improves the quality of the generated text descriptions of volleyball videos.
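The abstract does not specify how the attention mechanism is computed, so the following is only a minimal sketch of generic soft attention over per-frame features, not the authors' model. It assumes scaled dot-product scoring between a decoder hidden state and each frame feature; the names `attend`, `softmax`, `frames`, and `hidden` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden, frame_features):
    """Soft attention (assumed scaled dot-product scoring).

    hidden: decoder state, a list of floats.
    frame_features: one feature vector per video frame, same length as hidden.
    Returns the attention weights and the weighted-sum context vector.
    """
    scale = math.sqrt(len(hidden))
    scores = [
        sum(h * f for h, f in zip(hidden, feat)) / scale
        for feat in frame_features
    ]
    weights = softmax(scores)
    dim = len(frame_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three frames with 2-dimensional features (illustrative values only).
frames = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
hidden = [2.0, 0.0]
weights, context = attend(hidden, frames)
```

In a captioning decoder, a context vector like this would typically be fed to the LSTM together with the previous word at each step, so that frames most similar to the current decoder state contribute most to the next word.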

Similar articles

Research on Volleyball Video Intelligent Description Technology Combining the Long-Term and Short-Term Memory Network and Attention Mechanism.

Comput Intell Neurosci. 2021 Oct 14;2021:7088837. doi: 10.1155/2021/7088837. eCollection 2021.

Analysis of Volleyball Video Intelligent Description Technology Based on Computer Memory Network and Attention Mechanism.

Comput Intell Neurosci. 2021 Dec 28;2021:7976888. doi: 10.1155/2021/7976888. eCollection 2021.

A Study of Two-Way Short- and Long-Term Memory Network Intelligent Computing IoT Model-Assisted Home Education Attention Mechanism.

Comput Intell Neurosci. 2021 Dec 21;2021:3587884. doi: 10.1155/2021/3587884. eCollection 2021.

Volleyball training video classification description using the BiLSTM fusion attention mechanism.

Heliyon. 2024 Jul 16;10(15):e34735. doi: 10.1016/j.heliyon.2024.e34735. eCollection 2024 Aug 15.

Intelligent auxiliary system for music performance under edge computing and long short-term recurrent neural networks.

PLoS One. 2023 May 8;18(5):e0285496. doi: 10.1371/journal.pone.0285496. eCollection 2023.

Video captioning based on vision transformer and reinforcement learning.

PeerJ Comput Sci. 2022 Mar 16;8:e916. doi: 10.7717/peerj-cs.916. eCollection 2022.

Application of LSTM Neural Network Technology Embedded in English Intelligent Translation.

Comput Intell Neurosci. 2022 Sep 27;2022:1085577. doi: 10.1155/2022/1085577. eCollection 2022.

Chinese Image Caption Generation via Visual Attention and Topic Modeling.

IEEE Trans Cybern. 2022 Feb;52(2):1247-1257. doi: 10.1109/TCYB.2020.2997034. Epub 2022 Feb 16.

Language Processing Model Construction and Simulation Based on Hybrid CNN and LSTM.

Comput Intell Neurosci. 2021 Jul 6;2021:2578422. doi: 10.1155/2021/2578422. eCollection 2021.

Convolutional neural network-based recognition method for volleyball movements.

Heliyon. 2023 Jul 12;9(8):e18124. doi: 10.1016/j.heliyon.2023.e18124. eCollection 2023 Aug.

Cited by

Volleyball training video classification description using the BiLSTM fusion attention mechanism.

Heliyon. 2024 Jul 16;10(15):e34735. doi: 10.1016/j.heliyon.2024.e34735. eCollection 2024 Aug 15.

References

Long-Term Recurrent Convolutional Networks for Visual Recognition and Description.

IEEE Trans Pattern Anal Mach Intell. 2017 Apr;39(4):677-691. doi: 10.1109/TPAMI.2016.2599174. Epub 2016 Sep 1.

LSTM: A Search Space Odyssey.

IEEE Trans Neural Netw Learn Syst. 2017 Oct;28(10):2222-2232. doi: 10.1109/TNNLS.2016.2582924. Epub 2016 Jul 8.
