Robust action recognition using multi-scale spatial-temporal concatenations of local features as natural action structures.

Affiliation

Brain and Behavior Discovery Institute, Medical College of Georgia, Georgia Regents University, Augusta, Georgia, United States of America.

Publication information

PLoS One. 2012;7(10):e46686. doi: 10.1371/journal.pone.0046686. Epub 2012 Oct 4.

DOI:10.1371/journal.pone.0046686
PMID:23056403
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3464264/
Abstract

Humans and many other animals can detect, recognize, and classify natural actions in a very short time. How this is achieved by the visual system and how to make machines understand natural actions have been the focus of neurobiological studies and computational modeling in the last several decades. A key issue is what spatial-temporal features should be encoded and what the characteristics of their occurrences are in natural actions. Current global encoding schemes depend heavily on segmentation, while local encoding schemes lack descriptive power. Here, we propose natural action structures, i.e., multi-size, multi-scale, spatial-temporal concatenations of local features, as the basic features for representing natural actions. In this concept, any action is a spatial-temporal concatenation of a set of natural action structures, which convey a full range of information about natural actions. We took several steps to extract these structures. First, we sampled a large number of sequences of patches at multiple spatial-temporal scales. Second, we performed independent component analysis on the patch sequences and classified the independent components into clusters. Finally, we compiled a large set of natural action structures, with each corresponding to a unique combination of the clusters at the selected spatial-temporal scales. To classify human actions, we used a set of informative natural action structures as inputs to two widely used models. We found that the natural action structures obtained here achieved a significantly better recognition performance than low-level features and that the performance was better than or comparable to the best current models. We also found that the classification performance with natural action structures as features was only slightly affected by changes of scale and artificially added noise. We concluded that the natural action structures proposed here can be used as the basic encoding units of actions and may hold the key to natural action understanding.
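
The final extraction step described in the abstract, where each natural action structure corresponds to a unique combination of clusters across the selected spatial-temporal scales, can be illustrated with a minimal sketch. It assumes that every sampled location has already been assigned one cluster index per scale (for example, after ICA and clustering). The function names, the toy labels, and the normalized-histogram representation are hypothetical illustrations and are not taken from the paper.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Assumption: a structure is a tuple holding one cluster index per scale.
Structure = Tuple[int, ...]


def compile_structures(
    labels_per_location: Sequence[Sequence[int]],
) -> Dict[Structure, int]:
    """Assign an id to every unique combination of cluster indices."""
    vocabulary: Dict[Structure, int] = {}
    for labels in labels_per_location:
        key = tuple(labels)
        if key not in vocabulary:
            vocabulary[key] = len(vocabulary)  # ids follow first appearance
    return vocabulary


def structure_histogram(
    labels_per_location: Sequence[Sequence[int]],
    vocabulary: Dict[Structure, int],
) -> List[float]:
    """Normalized counts of known structures; unknown combinations are skipped."""
    counts = Counter(
        vocabulary[tuple(labels)]
        for labels in labels_per_location
        if tuple(labels) in vocabulary
    )
    total = sum(counts.values())
    return [counts[i] / total if total else 0.0 for i in range(len(vocabulary))]


# Toy data: three scales, so each location carries three cluster indices.
training_labels = [(0, 2, 1), (0, 2, 1), (3, 1, 1), (0, 0, 4)]
vocabulary = compile_structures(training_labels)

# A new video: two known structures and one combination never seen in training.
video_labels = [(0, 2, 1), (3, 1, 1), (3, 1, 1), (9, 9, 9)]
features = structure_histogram(video_labels, vocabulary)
print(vocabulary)
print(features)
```

A feature vector of this kind could then be passed to any standard classifier. The abstract does not name the two models the authors used, so none is assumed here.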

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/6721d4b59c37/pone.0046686.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/42b15d0ecf52/pone.0046686.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/bd84ec7e4368/pone.0046686.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/4ed931c756de/pone.0046686.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/e0820a951f6a/pone.0046686.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/ab0027f3af8c/pone.0046686.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/b0e6d351c28d/pone.0046686.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/e3494a90ffcb/pone.0046686.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/cc1fa1e5a89c/pone.0046686.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/d6ee3d927b96/pone.0046686.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/a1f8b281147c/pone.0046686.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1435/3464264/f3a8b883aea1/pone.0046686.g012.jpg
