Desktop Action Recognition From First-Person Point-of-View.

Publication information

IEEE Trans Cybern. 2019 May;49(5):1616-1628. doi: 10.1109/TCYB.2018.2806381. Epub 2018 Feb 27.

DOI: 10.1109/TCYB.2018.2806381
PMID: 29994596

Abstract

Desktop action recognition from first-person view (egocentric) video is an important task due to its omnipresence in our daily life, and the ideal first-person viewing perspective for observing hand-object interactions. However, no previous research efforts have been dedicated on the benchmark of the task. In this paper, we first release a dataset of daily desktop actions recorded with a wearable camera and publish it as a benchmark for desktop action recognition. Regular desktop activities of six participants were recorded in egocentric video with a wide-angle head-mounted camera. In particular, we focus on five common desktop actions in which hands are involved. We provide original video data, action annotations at frame-level, and hand masks at pixel-level. We also propose a feature representation for the characterization of different desktop actions based on the spatial and temporal information of hands. In experiments, we illustrate the statistical information about the dataset, and evaluate the action recognition performance of different features as a baseline. The proposed method achieves promising performance for five action classes.
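The abstract states only that the proposed features are built from the spatial and temporal information of the hands; the exact descriptor is not reproduced on this page. Purely as an illustration, the sketch below (Python) shows one plausible hand-based spatio-temporal descriptor computed from the per-frame, pixel-level hand masks the dataset provides. The function names (frame_hand_stats, hand_clip_descriptor) and the specific statistics (centroid, area, frame-to-frame deltas) are assumptions for illustration, not the paper's method.

```python
# Hypothetical sketch of a hand-based spatio-temporal descriptor.
# Assumes per-frame binary hand masks (H x W boolean arrays), matching the
# pixel-level hand masks described in the abstract; not the paper's actual feature.
import numpy as np

def frame_hand_stats(mask: np.ndarray) -> np.ndarray:
    """Spatial statistics of the hand region in one frame:
    normalized centroid (cx, cy) and normalized area."""
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    if xs.size == 0:                      # no hand visible in this frame
        return np.zeros(3, dtype=np.float64)
    return np.array([xs.mean() / w, ys.mean() / h, xs.size / (h * w)])

def hand_clip_descriptor(masks: list[np.ndarray]) -> np.ndarray:
    """Combine per-clip spatial statistics with simple temporal statistics
    (mean absolute frame-to-frame change of centroid and area)."""
    stats = np.stack([frame_hand_stats(m) for m in masks])          # (T, 3)
    spatial = np.concatenate([stats.mean(axis=0), stats.std(axis=0)])
    if len(masks) > 1:
        temporal = np.abs(np.diff(stats, axis=0)).mean(axis=0)
    else:
        temporal = np.zeros(3)
    return np.concatenate([spatial, temporal])                      # 9-D descriptor

if __name__ == "__main__":
    # Toy example: a fake 10-frame clip with a hand blob drifting to the right.
    clip = []
    for t in range(10):
        m = np.zeros((120, 160), dtype=bool)
        m[40:80, 20 + 5 * t: 60 + 5 * t] = True
        clip.append(m)
    print(hand_clip_descriptor(clip))
```

In a baseline of this kind, such per-clip descriptors would be paired with the frame-level action annotations and fed to a standard classifier such as an SVM; the choice of classifier here is again an assumption, as the page does not specify one.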

Similar articles

1. Desktop Action Recognition From First-Person Point-of-View.
   IEEE Trans Cybern. 2019 May;49(5):1616-1628. doi: 10.1109/TCYB.2018.2806381. Epub 2018 Feb 27.
2. Explicit modeling of human-object interactions in realistic videos.
   IEEE Trans Pattern Anal Mach Intell. 2013 Apr;35(4):835-48. doi: 10.1109/TPAMI.2012.175.
3. Deep Attention Network for Egocentric Action Recognition.
   IEEE Trans Image Process. 2019 Aug;28(8):3703-3713. doi: 10.1109/TIP.2019.2901707. Epub 2019 Feb 26.
4. Action search by example using randomized visual vocabularies.
   IEEE Trans Image Process. 2013 Jan;22(1):377-90. doi: 10.1109/TIP.2012.2216273. Epub 2012 Aug 30.
5. Multi-view human activity recognition in distributed camera sensor networks.
   Sensors (Basel). 2013 Jul 8;13(7):8750-70. doi: 10.3390/s130708750.
6. Observing human-object interactions: using spatial and functional compatibility for recognition.
   IEEE Trans Pattern Anal Mach Intell. 2009 Oct;31(10):1775-89. doi: 10.1109/TPAMI.2009.83.
7. Learning a Deep Model for Human Action Recognition from Novel Viewpoints.
   IEEE Trans Pattern Anal Mach Intell. 2018 Mar;40(3):667-681. doi: 10.1109/TPAMI.2017.2691768. Epub 2017 Apr 6.
8. NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding.
   IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2684-2701. doi: 10.1109/TPAMI.2019.2916873. Epub 2019 May 14.
9. A Multi-Modal Egocentric Activity Recognition Approach towards Video Domain Generalization.
   Sensors (Basel). 2024 Apr 12;24(8):2491. doi: 10.3390/s24082491.
10. A general framework for tracking multiple people from a moving camera.
    IEEE Trans Pattern Anal Mach Intell. 2013 Jul;35(7):1577-91. doi: 10.1109/TPAMI.2012.248.

Cited by

1. A union of deep learning and swarm-based optimization for 3D human action recognition.
   Sci Rep. 2022 Mar 31;12(1):5494. doi: 10.1038/s41598-022-09293-8.