Semantic Pooling for Complex Event Analysis in Untrimmed Videos.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2017 Aug;39(8):1617-1632. doi: 10.1109/TPAMI.2016.2608901. Epub 2016 Sep 13.

DOI:10.1109/TPAMI.2016.2608901
PMID:28113653
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5570670/
Abstract

Pooling plays an important role in generating a discriminative video representation. In this paper, we propose a new semantic pooling approach for challenging event analysis tasks (e.g., event detection, recognition, and recounting) in long untrimmed Internet videos, especially when only a few shots/segments are relevant to the event of interest while many other shots are irrelevant or even misleading. The commonly adopted pooling strategies aggregate the shots indifferently in one way or another, resulting in a great loss of information. Instead, in this work we first define a novel notion of semantic saliency that assesses the relevance of each shot with the event of interest. We then prioritize the shots according to their saliency scores since shots that are semantically more salient are expected to contribute more to the final event analysis. Next, we propose a new isotonic regularizer that is able to exploit the constructed semantic ordering information. The resulting nearly-isotonic support vector machine classifier exhibits higher discriminative power in event analysis tasks. Computationally, we develop an efficient implementation using the proximal gradient algorithm, and we prove new and closed-form proximal steps. We conduct extensive experiments on three real-world video datasets and achieve promising improvements.
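
The abstract describes ranking shots by semantic saliency and then penalizing classifier weights that contradict that ranking. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the penalty takes the standard nearly-isotonic form (a sum of positive parts of adjacent differences) and that pooling is a weighted sum of shot features. The function names, the toy features, and the saliency scores are all hypothetical.

```python
import numpy as np


def order_by_saliency(shot_features, saliency):
    """Sort shots from most to least salient (stable for ties)."""
    order = np.argsort(-saliency, kind="stable")
    return shot_features[order], saliency[order]


def nearly_isotonic_penalty(weights):
    """Sum of violations of the ordering w[0] >= w[1] >= ... >= w[n-1].

    Assumed form: sum_i max(w[i+1] - w[i], 0). It is zero when the
    weights respect the saliency ordering and grows with each violation.
    """
    return float(np.sum(np.maximum(weights[1:] - weights[:-1], 0.0)))


def weighted_pool(shot_features, weights):
    """Pool per-shot features into one video-level vector."""
    return weights @ shot_features


# Toy video with three shots and two-dimensional features (made-up values).
features = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
saliency = np.array([0.2, 0.9, 0.5])

ordered, sorted_saliency = order_by_saliency(features, saliency)

monotone = np.array([0.6, 0.3, 0.1])   # respects the saliency ordering
violating = np.array([0.1, 0.3, 0.6])  # favors the least salient shots

penalty_ok = nearly_isotonic_penalty(monotone)
penalty_bad = nearly_isotonic_penalty(violating)
video_vector = weighted_pool(ordered, monotone)
```

In this sketch the penalty is a soft constraint: weights that mildly violate the ordering are discouraged rather than forbidden, which is what "nearly" isotonic suggests. How the paper combines this term with the SVM loss and solves it with proximal gradient steps is not specified in the abstract and is not reproduced here.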


Similar articles

1
Semantic Pooling for Complex Event Analysis in Untrimmed Videos.
IEEE Trans Pattern Anal Mach Intell. 2017 Aug;39(8):1617-1632. doi: 10.1109/TPAMI.2016.2608901. Epub 2016 Sep 13.
2
Keyframe extraction from laparoscopic videos based on visual saliency detection.
Comput Methods Programs Biomed. 2018 Oct;165:13-23. doi: 10.1016/j.cmpb.2018.07.004. Epub 2018 Jul 18.
3
Visual event recognition in videos by learning from Web data.
IEEE Trans Pattern Anal Mach Intell. 2012 Sep;34(9):1667-80. doi: 10.1109/TPAMI.2011.265.
4
Submodular Attribute Selection for Visual Recognition.
IEEE Trans Pattern Anal Mach Intell. 2017 Nov;39(11):2242-2255. doi: 10.1109/TPAMI.2016.2636827. Epub 2016 Dec 7.
5
Close Human Interaction Recognition Using Patch-Aware Models.
IEEE Trans Image Process. 2016 Jan;25(1):167-78. doi: 10.1109/TIP.2015.2498410. Epub 2015 Nov 5.
6
Discovering motion primitives for unsupervised grouping and one-shot learning of human actions, gestures, and expressions.
IEEE Trans Pattern Anal Mach Intell. 2013 Jul;35(7):1635-48. doi: 10.1109/TPAMI.2012.253.
7
A semantic autonomous video surveillance system for dense camera networks in Smart Cities.
Sensors (Basel). 2012;12(8):10407-29. doi: 10.3390/s120810407. Epub 2012 Aug 2.
8
Classification approach for automatic laparoscopic video database organization.
Int J Comput Assist Radiol Surg. 2015 Sep;10(9):1449-60. doi: 10.1007/s11548-015-1183-4. Epub 2015 Apr 7.
9
Deep Attention Network for Egocentric Action Recognition.
IEEE Trans Image Process. 2019 Aug;28(8):3703-3713. doi: 10.1109/TIP.2019.2901707. Epub 2019 Feb 26.
10
Explicit modeling of human-object interactions in realistic videos.
IEEE Trans Pattern Anal Mach Intell. 2013 Apr;35(4):835-48. doi: 10.1109/TPAMI.2012.175.

Cited by

1
A new approach of anomaly detection in shopping center surveillance videos for theft prevention based on RLCNN model.
PeerJ Comput Sci. 2025 Jun 18;11:e2944. doi: 10.7717/peerj-cs.2944. eCollection 2025.
2
A performance comparison of supervised machine learning models for Covid-19 tweets sentiment analysis.
PLoS One. 2021 Feb 25;16(2):e0245909. doi: 10.1371/journal.pone.0245909. eCollection 2021.
3
Open-Environment Robotic Acoustic Perception for Object Recognition.
Front Neurorobot. 2019 Nov 22;13:96. doi: 10.3389/fnbot.2019.00096. eCollection 2019.
4
Vision-Based Traffic Sign Detection and Recognition Systems: Current Trends and Challenges.
Sensors (Basel). 2019 May 6;19(9):2093. doi: 10.3390/s19092093.
5
Vision-Based Robot Navigation through Combining Unsupervised Learning and Hierarchical Reinforcement Learning.
Sensors (Basel). 2019 Apr 1;19(7):1576. doi: 10.3390/s19071576.
6
Low-Rank Graph-Regularized Structured Sparse Regression for Identifying Genetic Biomarkers.
IEEE Trans Big Data. 2017 Oct-Dec;3(4):405-414. doi: 10.1109/TBDATA.2017.2735991. Epub 2017 Aug 4.

References

1
Video2vec Embeddings Recognize Events When Examples Are Scarce.
IEEE Trans Pattern Anal Mach Intell. 2017 Oct;39(10):2089-2103. doi: 10.1109/TPAMI.2016.2627563. Epub 2016 Nov 10.
2
Order Preserving Sparse Coding.
IEEE Trans Pattern Anal Mach Intell. 2015 Aug;37(8):1615-28. doi: 10.1109/TPAMI.2014.2362935.
3
Knowledge Adaptation with Partially Shared Features for Event Detection Using Few Exemplars.
IEEE Trans Pattern Anal Mach Intell. 2014 Sep;36(9):1789-802. doi: 10.1109/TPAMI.2014.2306419.
4
Feature Grouping and Selection Over an Undirected Graph.
KDD. 2012:922-930. doi: 10.1145/2339530.2339675.
5
Visual event recognition in videos by learning from Web data.
IEEE Trans Pattern Anal Mach Intell. 2012 Sep;34(9):1667-80. doi: 10.1109/TPAMI.2011.265.
6
A multimedia retrieval framework based on semi-supervised ranking and relevance feedback.
IEEE Trans Pattern Anal Mach Intell. 2012 Apr;34(4):723-42. doi: 10.1109/TPAMI.2011.170.
7
Learning the parts of objects by non-negative matrix factorization.
Nature. 1999 Oct 21;401(6755):788-91. doi: 10.1038/44565.
8
Shifts in selective visual attention: towards the underlying neural circuitry.
Hum Neurobiol. 1985;4(4):219-27.