

Action Recognition and Benchmark Using Event Cameras.

Authors

Gao Yue, Lu Jiaxuan, Li Siqi, Ma Nan, Du Shaoyi, Li Yipeng, Dai Qionghai

Publication

IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):14081-14097. doi: 10.1109/TPAMI.2023.3300741. Epub 2023 Nov 3.

DOI: 10.1109/TPAMI.2023.3300741
PMID: 37527291
Abstract

Recent years have witnessed remarkable achievements in video-based action recognition. Apart from traditional frame-based cameras, event cameras are bio-inspired vision sensors that only record pixel-wise brightness changes rather than the brightness value. However, little effort has been made in event-based action recognition, and large-scale public datasets are also nearly unavailable. In this paper, we propose an event-based action recognition framework called EV-ACT. The Learnable Multi-Fused Representation (LMFR) is first proposed to integrate multiple event information in a learnable manner. The LMFR with dual temporal granularity is fed into the event-based slow-fast network for the fusion of appearance and motion features. A spatial-temporal attention mechanism is introduced to further enhance the learning capability of action recognition. To prompt research in this direction, we have collected the largest event-based action recognition benchmark named THU-50 and the accompanying THU-50-CHL dataset under challenging environments, including a total of over 12,830 recordings from 50 action categories, which is over 4 times the size of the previous largest dataset. Experimental results show that our proposed framework could achieve improvements of over 14.5%, 7.6%, 11.2%, and 7.4% compared to previous works on four benchmarks. We have also deployed our proposed EV-ACT framework on a mobile platform to validate its practicality and efficiency.
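The abstract's LMFR integrates multiple event representations with learned fusion weights; the paper itself does not spell out the construction here. As context, event cameras emit a stream of (x, y, timestamp, polarity) tuples, and a common fixed (non-learnable) way to turn such a stream into a frame-like tensor is a time-binned "voxel grid". The sketch below is that generic baseline only, not the paper's LMFR; the function name and event layout are illustrative assumptions.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate an event stream into a (num_bins, H, W) tensor.

    events: array of shape (N, 4) with columns (x, y, t, polarity),
    polarity in {-1, +1}. This is a generic fixed time binning used
    for illustration; the paper's LMFR instead learns how multiple
    event representations are fused.
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2].astype(float)
    p = events[:, 3]
    # Normalize timestamps to [0, num_bins) and assign each event a bin.
    t0, t1 = t.min(), t.max()
    span = max(t1 - t0, 1e-9)
    bins = ((t - t0) / span * (num_bins - 1e-6)).astype(int)
    # Signed accumulation: positive and negative polarity events
    # add and subtract at their pixel location within their time bin.
    np.add.at(grid, (bins, y, x), p)
    return grid
```

Feeding two such grids built at coarse and fine temporal granularity to separate branches is one simple way to realize the "dual temporal granularity" input that the abstract's slow-fast network consumes.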


Similar articles

1. Action Recognition and Benchmark Using Event Cameras.
   IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):14081-14097. doi: 10.1109/TPAMI.2023.3300741. Epub 2023 Nov 3.
2. Hypergraph-Based Multi-View Action Recognition Using Event Cameras.
   IEEE Trans Pattern Anal Mach Intell. 2024 Oct;46(10):6610-6622. doi: 10.1109/TPAMI.2024.3382117. Epub 2024 Sep 6.
3. SuperFast: 200× Video Frame Interpolation via Event Camera.
   IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7764-7780. doi: 10.1109/TPAMI.2022.3224051. Epub 2023 May 5.
4. Event-Based Vision: A Survey.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jan;44(1):154-180. doi: 10.1109/TPAMI.2020.3008413. Epub 2021 Dec 7.
5. Multi-Stage Network for Event-Based Video Deblurring with Residual Hint Attention.
   Sensors (Basel). 2023 Mar 7;23(6):2880. doi: 10.3390/s23062880.
6. Event-Stream Representation for Human Gaits Identification Using Deep Neural Networks.
   IEEE Trans Pattern Anal Mach Intell. 2022 Jul;44(7):3436-3449. doi: 10.1109/TPAMI.2021.3054886. Epub 2022 Jun 3.
7. FLGR: Fixed Length Gists Representation Learning for RNN-HMM Hybrid-Based Neuromorphic Continuous Gesture Recognition.
   Front Neurosci. 2019 Feb 12;13:73. doi: 10.3389/fnins.2019.00073. eCollection 2019.
8. A Deep Sequence Learning Framework for Action Recognition in Small-Scale Depth Video Dataset.
   Sensors (Basel). 2022 Sep 9;22(18):6841. doi: 10.3390/s22186841.
9. VisEvent: Reliable Object Tracking via Collaboration of Frame and Event Flows.
   IEEE Trans Cybern. 2024 Mar;54(3):1997-2010. doi: 10.1109/TCYB.2023.3318601. Epub 2024 Feb 9.
10. EV-LFV: Synthesizing Light Field Event Streams from an Event Camera and Multiple RGB Cameras.
   IEEE Trans Vis Comput Graph. 2023 Nov;29(11):4546-4555. doi: 10.1109/TVCG.2023.3320271. Epub 2023 Nov 2.

Cited by

1. Spike-HAR++: an energy-efficient and lightweight parallel spiking transformer for event-based human action recognition.
   Front Comput Neurosci. 2024 Nov 26;18:1508297. doi: 10.3389/fncom.2024.1508297. eCollection 2024.