

Scene Classification for Sports Video Summarization Using Transfer Learning.

Affiliations

Department of Information and Communication Engineering, Yeungnam University, Gyeongsan-si 38541, Korea.

Division of Computer Convergence, Chungnam National University, Daejeon 34134, Korea.

Publication Information

Sensors (Basel). 2020 Mar 18;20(6):1702. doi: 10.3390/s20061702.

Abstract

This paper proposes a novel method for sports video scene classification aimed specifically at video summarization. A shorter version of a video is more appealing to publish than the full version because it delivers instant entertainment, yet generating such summaries manually is a tedious task that consumes significant labor hours and machine time. Driven by the growing demand for video summarization in marketing, advertising, awareness videos, documentaries, and other interest groups, researchers continue to propose automation frameworks and novel schemes. Since scene classification is a fundamental component of video summarization and video analysis, its quality is particularly important. This article examines practical implementation gaps in existing techniques and presents a method for high-quality scene classification. We take cricket as a case study and classify five scene categories: batting, bowling, boundary, crowd, and close-up. Our model builds on a pre-trained AlexNet Convolutional Neural Network (CNN) and adds new fully connected layers arranged in an encoder fashion. With data augmentation, the method achieves a high accuracy of 99.26% on a relatively small dataset. We compare performance against baseline approaches as well as state-of-the-art models on cricket videos, evaluating several deep-learning architectures: Inception V3, Visual Geometry Group networks (VGGNet16, VGGNet19), Residual Network (ResNet50), and AlexNet. Our experiments demonstrate that the proposed AlexNet-based method produces better results than existing proposals.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cef2/7146586/e83cb9cb2221/sensors-20-01702-g001.jpg
