
Discriminative Multi-View Dynamic Image Fusion for Cross-View 3-D Action Recognition.

Authors

Wang Yancheng, Xiao Yang, Lu Junyi, Tan Bo, Cao Zhiguo, Zhang Zhenjun, Zhou Joey Tianyi

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5332-5345. doi: 10.1109/TNNLS.2021.3070179. Epub 2022 Oct 5.

DOI: 10.1109/TNNLS.2021.3070179
PMID: 33852396

Abstract

Dramatic variation in imaging viewpoint is the critical challenge for action recognition in depth video. One feasible way to address it is to enhance the view tolerance of the visual features while maintaining strong discriminative capacity. The multi-view dynamic image (MVDI) is a recently proposed 3-D action representation that compactly encodes human motion information and 3-D visual cues. However, it is still view-sensitive. To improve its performance, we propose a discriminative MVDI fusion method based on multi-instance learning (MIL). Specifically, the dynamic images (DIs) from different observation viewpoints are regarded as the instances for 3-D action characterization. After being encoded with Fisher vectors (FVs), they are aggregated by sum-pooling to yield a representative 3-D action signature. Our insight is that viewpoint aggregation helps to enhance view tolerance, and that FV encoding maps the raw DI features to a higher dimensional feature space, which promotes discriminative power. A discriminative viewpoint instance discovery method is also proposed to discard viewpoint instances that are unfavorable for action characterization. Extensive experiments on five data sets demonstrate that the proposed method significantly enhances the performance of cross-view 3-D action recognition. It is also applicable to cross-view 3-D object recognition. The source code is available at https://github.com/3huo/ActionView.
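
The encode-then-aggregate step described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation (that is in the repository linked above). It assumes each viewpoint's dynamic image has already been converted to a set of local descriptors, it computes only the mean-gradient part of a Fisher vector under a diagonal-covariance GMM, and it stands in for the discriminative viewpoint discovery with a boolean mask supplied by the caller. The names `fisher_vector_means`, `fuse_views`, and `keep` are hypothetical.

```python
import numpy as np


def fisher_vector_means(x, weights, means, variances):
    """Mean-gradient part of a Fisher vector.

    x: (N, D) descriptors; weights: (K,); means, variances: (K, D)
    for a diagonal-covariance GMM. Returns a vector of length K * D.
    """
    n = x.shape[0]
    sigma = np.sqrt(variances)                                 # (K, D)
    diff = (x[:, None, :] - means[None, :, :]) / sigma[None]   # (N, K, D)
    # Log of w_k * N(x | mu_k, sigma_k^2); the 2*pi constant cancels
    # when the posteriors are normalised, so it is omitted.
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(diff ** 2, axis=2)
             - np.sum(np.log(sigma), axis=1)[None, :])
    log_p -= log_p.max(axis=1, keepdims=True)                  # stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)                  # (N, K)
    grad = np.einsum("nk,nkd->kd", gamma, diff)                # (K, D)
    grad /= n * np.sqrt(weights)[:, None]
    return grad.ravel()


def fuse_views(view_descriptors, gmm, keep=None):
    """Encode each viewpoint instance, then sum-pool the encodings.

    keep: optional boolean mask (assumption) marking which viewpoint
    instances to retain before pooling.
    """
    fvs = np.stack([fisher_vector_means(x, *gmm) for x in view_descriptors])
    if keep is not None:
        fvs = fvs[np.asarray(keep, dtype=bool)]
    return fvs.sum(axis=0)


# Toy data: 3 viewpoints, 5 descriptors each, D = 4, GMM with K = 2.
rng = np.random.default_rng(0)
views = [rng.normal(size=(5, 4)) for _ in range(3)]
gmm = (np.array([0.5, 0.5]), rng.normal(size=(2, 4)), np.ones((2, 4)))
signature = fuse_views(views, gmm, keep=[True, False, True])
```

Because sum-pooling is order-independent, the fused signature does not depend on how the viewpoints are enumerated, which is one way to read the abstract's claim that viewpoint aggregation helps view tolerance.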


Similar articles

1. Discriminative Multi-View Dynamic Image Fusion for Cross-View 3-D Action Recognition.
   IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5332-5345. doi: 10.1109/TNNLS.2021.3070179. Epub 2022 Oct 5.
2. Beyond Pattern Variance: Unsupervised 3-D Action Representation Learning With Point Cloud Sequence.
   IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):18186-18199. doi: 10.1109/TNNLS.2023.3312673. Epub 2024 Dec 2.
3. Cross-View Action Recognition Over Heterogeneous Feature Spaces.
   IEEE Trans Image Process. 2015 Nov;24(11):4096-108. doi: 10.1109/TIP.2015.2445293. Epub 2015 Jun 12.
4. Dual-Recommendation Disentanglement Network for View Fuzz in Action Recognition.
   IEEE Trans Image Process. 2023;32:2719-2733. doi: 10.1109/TIP.2023.3273459. Epub 2023 May 16.
5. Multi-Domain & Multi-Task Learning for Human Action Recognition.
   IEEE Trans Image Process. 2018 Sep 28. doi: 10.1109/TIP.2018.2872879.
6. A discriminative model of motion and cross ratio for view-invariant action recognition.
   IEEE Trans Image Process. 2012 Apr;21(4):2187-97. doi: 10.1109/TIP.2011.2176346.
7. Specificity and Latent Correlation Learning for Action Recognition Using Synthetic Multi-View Data From Depth Maps.
   IEEE Trans Image Process. 2017 Dec;26(12):5560-5574. doi: 10.1109/TIP.2017.2740122. Epub 2017 Aug 14.
8. Dynamic Spatio-Temporal Bag of Expressions (D-STBoE) Model for Human Action Recognition.
   Sensors (Basel). 2019 Jun 21;19(12):2790. doi: 10.3390/s19122790.
9. Multi-Scale Multi-View Deep Feature Aggregation for Food Recognition.
   IEEE Trans Image Process. 2020;29:265-276. doi: 10.1109/TIP.2019.2929447. Epub 2019 Jul 29.
10. Joint Representation Learning and Keypoint Detection for Cross-View Geo-Localization.
    IEEE Trans Image Process. 2022;31:3780-3792. doi: 10.1109/TIP.2022.3175601. Epub 2022 Jun 2.

Cited by

1. Multiview child motor development dataset for AI-driven assessment of child development.
   Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad039. Epub 2023 May 27.
2. In-Home Older Adults' Activity Pattern Monitoring Using Depth Sensors: A Review.
   Sensors (Basel). 2022 Nov 23;22(23):9067. doi: 10.3390/s22239067.