Dark-DSAR: Lightweight one-step pipeline for action recognition in dark videos.

Affiliations

The State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China.

The Department of Pediatrics, Renmin Hospital, Wuhan University, Wuhan, China.

Publication information

Neural Netw. 2024 Nov;179:106622. doi: 10.1016/j.neunet.2024.106622. Epub 2024 Aug 8.

DOI: 10.1016/j.neunet.2024.106622
PMID: 39142175

Abstract

Human action recognition in dark videos has a wide range of real-world applications. General action recognition methods focus on the actor or the action itself and ignore the dark scene in which the action happens, which results in unsatisfactory recognition accuracy. For dark scenes, existing two-step action recognition methods have complex stages because they introduce additional augmentation steps, and the one-step pipeline method is not lightweight enough. To address these issues, this paper proposes a one-step Transformer-based method named Dark Domain Shift for Action Recognition (Dark-DSAR), which integrates domain migration and classification into a single step and enhances the model's functional coherence across these two tasks, giving Dark-DSAR low computational cost but high accuracy. Specifically, the domain shift module (DSM) performs dark-to-bright domain adaptation so as to reduce the number of parameters and the computational cost. In addition, we explore the matching relationship between the input video size and the model, which further improves inference efficiency by removing redundant information in videos through spatial resolution dropping. Extensive experiments were conducted on the ARID1.5, HMDB51-Dark, and UAV-human-night datasets. The results show that Dark-DSAR obtains the best Top-1 accuracy on ARID1.5, 89.49%, which is 2.56% higher than the state-of-the-art method, and achieves 67.13% and 61.9% on HMDB51-Dark and UAV-human-night, respectively. Ablation experiments further reveal that action classifiers equipped with our DSM gain ≥1% in accuracy compared with the original models.
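
As a rough illustration of why spatial resolution dropping can reduce inference cost in a Transformer, the sketch below counts the tokens produced by a generic ViT-style video tokenizer and compares the pairwise self-attention work at two input sizes. The abstract does not describe the actual Dark-DSAR tokenizer or resolutions, so the patch size of 16, the tubelet size of 2, the 16-frame clip, and the 224 and 112 pixel inputs are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def token_count(frames, height, width, patch=16, tubelet=2):
    """Tokens produced by an assumed ViT-style video tokenizer.

    Each token covers `tubelet` frames and a `patch` x `patch` pixel area.
    These sizes are illustrative assumptions, not values from the paper.
    """
    if frames % tubelet or height % patch or width % patch:
        raise ValueError("dimensions must be divisible by the patch/tubelet size")
    return (frames // tubelet) * (height // patch) * (width // patch)


def relative_attention_cost(tokens_small, tokens_large):
    """Ratio of pairwise self-attention interactions, which grow as tokens**2."""
    return (tokens_small ** 2) / (tokens_large ** 2)


# Hypothetical clip: 16 frames at 224x224, then with spatial resolution halved.
full = token_count(16, 224, 224)      # 8 * 14 * 14 tokens
dropped = token_count(16, 112, 112)   # 8 * 7 * 7 tokens
ratio = relative_attention_cost(dropped, full)

print(full, dropped, ratio)
```

Under these assumptions, halving the height and width cuts the token count by a factor of 4 and the pairwise attention interactions by a factor of 16. This only shows the general scaling argument and says nothing about how Dark-DSAR selects its input size.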
