Deep Attentive Video Summarization With Distribution Consistency Learning

Authors

Ji Zhong, Zhao Yuxiao, Pang Yanwei, Li Xi, Han Jungong

Publication details

IEEE Trans Neural Netw Learn Syst. 2021 Apr;32(4):1765-1775. doi: 10.1109/TNNLS.2020.2991083. Epub 2021 Apr 2.

DOI: 10.1109/TNNLS.2020.2991083
PMID: 32396107

Abstract

This article studies supervised video summarization by formulating it as a sequence-to-sequence learning problem, in which the input is the sequence of original video frames and the output is the sequence of their predicted importance scores. Two critical issues are addressed: insufficient short-term contextual attention and distribution inconsistency. The former arises because existing approaches focus heavily on long-term encoder-decoder attention and therefore fail to capture enough short-term contextual attention information within the video sequence itself. The latter refers to the inconsistency between the distribution of the predicted importance-score sequence and that of the ground-truth sequence, which may lead to a suboptimal solution. To mitigate the first issue, we incorporate a self-attention mechanism in the encoder to highlight the important keyframes in a short-term context. This mechanism, together with the encoder-decoder attention, constitutes our deep attentive models for video summarization. For the second issue, we propose a distribution consistency learning method that employs a simple yet effective regularization loss term, which seeks a consistent distribution for the two sequences. Our final approach is dubbed Attentive and Distribution consistent video Summarization (ADSum). Extensive experiments on benchmark data sets demonstrate the superiority of the proposed ADSum approach over state-of-the-art approaches.
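
The two ideas in the abstract can be illustrated with a small sketch. It is not the ADSum implementation: the abstract does not specify the attention layout or the form of the regularizer, so everything below is an assumption made for illustration. The windowed scaled dot-product attention (`local_self_attention`, `window`), the choice of a KL divergence between softmax-normalized score sequences as the consistency term, and the weight `lam` are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def local_self_attention(frames, window=2):
    """Scaled dot-product self-attention over frame features.

    Assumption: "short-term context" is modeled by letting each frame
    attend only to neighbours within +/- `window` positions.
    """
    dim = len(frames[0])
    attended = []
    for i, query in enumerate(frames):
        lo, hi = max(0, i - window), min(len(frames), i + window + 1)
        scores = [
            sum(q * k for q, k in zip(query, frames[j])) / math.sqrt(dim)
            for j in range(lo, hi)
        ]
        weights = softmax(scores)
        attended.append([
            sum(w * frames[j][d] for w, j in zip(weights, range(lo, hi)))
            for d in range(dim)
        ])
    return attended


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mse(pred, target):
    """Mean squared error between two score sequences."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(pred)


def total_loss(pred, target, lam=0.1):
    """Regression loss plus a hypothetical distribution-consistency term.

    Both score sequences are turned into distributions with softmax, and
    the KL divergence between them is added as a regularizer.
    """
    consistency = kl_divergence(softmax(target), softmax(pred))
    return mse(pred, target) + lam * consistency
```

With these definitions, `total_loss(scores, scores)` is zero, and the consistency term grows as the shape of the predicted score sequence drifts away from the ground truth, even when the per-frame error is moderate.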

