Reasoning action-centric temporal relations at rich feature hierarchies for action recognition.

Author information

Liang Manshu, Wu Wenbin, Chen Zhuolei, Han Tengfei, Zheng Yuan

Affiliations

Electric Power Science Research Institute, State Grid Fujian Electric Power Co. Ltd., Fujian, China.

School of Computer Science, Civil Aviation Flight University of China, Deyang, China.

Publication information

PLoS One. 2025 Jul 24;20(7):e0327302. doi: 10.1371/journal.pone.0327302. eCollection 2025.

DOI:10.1371/journal.pone.0327302

PMID:40705733

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12288993/

Abstract

Reasoning temporal relations among action-related objects plays an important role in action recognition. However, previous approaches only focus the reasoning on high-level semantics and inevitably involve the background in reasoning. In this work, we propose to formulate the temporal relational reasoning in an action-centric and hierarchical style, with a novel Action-centric Temporal-relational Reasoning (ATR) block. Specifically, ATR comprises two components: an Action-related Region Locator (ARL) to locate the action-related regions via estimating the actionness, and an Efficient Action-centric Reasoner (EAR) to efficiently reason the temporal relations between the located regions so as to realize the action-centric reasoning. Thanks to its flexible and efficient designs, our ATR can be directly integrated into existing action recognition models at different depths, empowering the hierarchical reasoning on the action-centric temporal relations at the cost of minor computational overhead. We extensively evaluate our ATR block on three action recognition benchmarks, Something-Something V1, V2, and Kinetics, with the backbones of TSN, TSM, and SlowOnly. The consistent and notable improvements over the strong baselines sufficiently corroborate the effectiveness of ATR, along with the action-centric and hierarchical formulation for temporal relational reasoning. Our proposed approach provides potential practical significance for real-world scenarios.
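The two-stage idea in the abstract (first locate action-related regions by their actionness, then reason only over those regions across time) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how ARL scores regions or how EAR achieves its efficiency, so the sigmoid scorer, the top-k selection, the scaled dot-product attention, the residual update, and all names (`locate_action_regions`, `reason_over_regions`, `w`, `k`) are assumptions made purely for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def locate_action_regions(frame, w, k):
    """Hypothetical ARL stand-in: sigmoid 'actionness' per region, keep the top-k indices."""
    scores = [1.0 / (1.0 + math.exp(-dot(w, x))) for x in frame]
    ranked = sorted(range(len(frame)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def reason_over_regions(video, w, k):
    """Hypothetical EAR stand-in: attention restricted to the selected regions of all frames."""
    selected = [(t, i)
                for t, frame in enumerate(video)
                for i in locate_action_regions(frame, w, k)]
    tokens = [video[t][i] for t, i in selected]
    dim = len(w)
    # Copy the input; regions that were not selected (background) pass through unchanged.
    out = [[list(x) for x in frame] for frame in video]
    for (t, i), q in zip(selected, tokens):
        weights = softmax([dot(q, key) / math.sqrt(dim) for key in tokens])
        context = [sum(a * tok[d] for a, tok in zip(weights, tokens))
                   for d in range(dim)]
        out[t][i] = [x + c for x, c in zip(q, context)]  # residual update
    return out, selected


# Two frames, three regions per frame, two feature channels per region.
video = [
    [[1.0, 0.0], [0.0, 0.0], [0.2, 0.2]],
    [[0.0, 1.0], [2.0, 0.0], [0.0, 0.0]],
]
w = [1.0, 1.0]  # assumed actionness weights
out, selected = reason_over_regions(video, w, k=1)
print(selected)  # [(0, 0), (1, 1)]
```

In this sketch the attention cost depends on the number of selected regions (k per frame) rather than on all regions, which is one plausible reading of how restricting reasoning to action-related regions keeps the overhead small; the paper's actual mechanism may differ. Because the output has the same shape as the input, a block of this kind could in principle be placed after backbone stages at several depths, in line with the hierarchical use the abstract describes.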


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82f5/12288993/e23e6dc31f85/pone.0327302.g001.jpg
