Multimodal and multiscale feature fusion for weakly supervised video anomaly detection.

Authors

Sun Wenwen, Cao Lin, Guo Yanan, Du Kangning

Affiliations

Key Laboratory of the Ministry of Education for Optoelectronic Measurement Technology and Instrument, Beijing Information Science and Technology University, Beijing, 100192, China.

School of Instrument Science and Opto-Electronics Engineering, Beijing Information Science and Technology University, Beijing, 100192, China.

Publication information

Sci Rep. 2024 Oct 1;14(1):22835. doi: 10.1038/s41598-024-73462-0.

DOI:10.1038/s41598-024-73462-0
PMID:39354033
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11445271/
Abstract

Weakly supervised video anomaly detection aims to detect anomalous events with only video-level labels. In the absence of boundary information for anomaly segments, most existing methods rely on multiple instance learning. In these approaches, the predictions for unlabeled video snippets are guided by the classification of labeled untrimmed videos. However, these methods do not account for issues such as video blur and visual occlusion, which can hinder accurate anomaly detection. To address these issues, we propose a novel weakly supervised video anomaly detection method that fuses multimodal and multiscale features. Firstly, RGB and optical flow snippets are input into pre-trained I3D to extract appearance and motion features. Then, we introduce an Attention De-redundancy (AD) module, which employs an attention mechanism to filter out task-irrelevant redundancy in these appearance and motion features. Next, to mitigate the effects of video blurring and visual occlusion, we propose a Multi-scale Feature Learning module. This module captures long-term and short-term temporal dependencies among video snippets to provide global and local guidance for blurred or occluded video snippets. Finally, to effectively utilize the discriminative features of different modalities, we propose an Adaptive Feature Fusion module. This module adaptively fuses appearance and motion features based on their respective feature weights. Extensive experimental results demonstrate that our proposed method outperforms mainstream unsupervised and weakly supervised methods in terms of AUC. Specifically, our proposed method achieves 97.00% AUC and 85.31% AUC on two benchmark datasets, i.e., ShanghaiTech and UCF-Crime, respectively.
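
The abstract names two ideas that a small example can make concrete: fusing appearance and motion features according to per-modality weights, and scoring a whole video from its snippet scores under multiple instance learning. The sketch below is illustrative only. The abstract gives no formulas, so the softmax weighting, the top-k mean aggregation, and every name (`adaptive_fuse`, `video_score_topk`, `w_app`, `w_mot`, `k`) are assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(appearance, motion, w_app, w_mot):
    """Fuse two equal-length feature vectors for one snippet.

    w_app and w_mot are hypothetical scalar logits expressing how much
    each modality should contribute (assumed here; in a trained model
    they would be predicted from the features themselves).
    """
    a, m = softmax([w_app, w_mot])
    return [a * x + m * y for x, y in zip(appearance, motion)]


def video_score_topk(snippet_scores, k=3):
    """Video-level anomaly score as the mean of the k highest snippet scores.

    This is one common multiple instance learning aggregation (an
    assumption here): a video is a bag of snippets, and only the
    video-level label supervises the aggregated score.
    """
    k = max(1, min(k, len(snippet_scores)))
    top = sorted(snippet_scores, reverse=True)[:k]
    return sum(top) / k


if __name__ == "__main__":
    # Equal logits give equal weights, so the result is the plain average.
    print(adaptive_fuse([1.0, 0.0], [0.0, 1.0], 0.0, 0.0))
    # Only the two most anomalous snippets determine the video score.
    print(video_score_topk([0.1, 0.9, 0.8, 0.2], k=2))
```

With equal logits the fusion reduces to an average of the two modalities. Raising `w_app` relative to `w_mot` shifts the fused vector toward the appearance features, which is the behaviour one would want when, for example, optical flow is unreliable in a blurred snippet.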

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/18238614b1ec/41598_2024_73462_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/2c090a5ad372/41598_2024_73462_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/14466e172d43/41598_2024_73462_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/8237fa80b89d/41598_2024_73462_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/b34b8a751cb6/41598_2024_73462_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/92283ede4471/41598_2024_73462_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/1681aa000b07/41598_2024_73462_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/dca24fbd6a68/41598_2024_73462_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/485a/11445271/e0df93d93acf/41598_2024_73462_Fig9_HTML.jpg
