Revisiting Anchor Mechanisms for Temporal Action Localization.

Author information

Yang Le, Peng Houwen, Zhang Dingwen, Fu Jianlong, Han Junwei

Publication information

IEEE Trans Image Process. 2020 Aug 19;PP. doi: 10.1109/TIP.2020.3016486.

Abstract

Most of the current action localization methods follow an anchor-based pipeline: depicting action instances by pre-defined anchors, learning to select the anchors closest to the ground truth, and predicting the confidence of anchors with refinements. Pre-defined anchors set prior about the location and duration for action instances, which facilitates the localization for common action instances but limits the flexibility for tackling action instances with drastic varieties, especially for extremely short or extremely long ones. To address this problem, this paper proposes a novel anchor-free action localization module that assists action localization by temporal points. Specifically, this module represents an action instance as a point with its distances to the starting boundary and ending boundary, alleviating the pre-defined anchor restrictions in terms of action localization and duration. The proposed anchor-free module is capable of predicting the action instances whose duration is either extremely short or extremely long. By combining the proposed anchor-free module with a conventional anchor-based module, we propose a novel action localization framework, called A2Net. The cooperation between anchor-free and anchor-based modules achieves superior performance to the state-of-the-art on THUMOS14 (45.5% vs. 42.8%). Furthermore, comprehensive experiments demonstrate the complementarity between the anchor-free and the anchor-based module, making A2Net simple but effective.
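The point-based representation described in the abstract can be illustrated with a minimal sketch. This is not the A2Net implementation: the function names `encode_point` and `decode_points`, the use of seconds as the time unit, and the clipping of decoded boundaries to the video length are all assumptions made for illustration. The sketch only shows how a temporal point plus its distances to the starting and ending boundaries determines a segment of any duration, without reference to a pre-defined anchor.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float]  # (start, end), in seconds (assumed unit)


def encode_point(t: float, segment: Segment) -> Tuple[float, float]:
    """Return the distances from temporal point t to the segment boundaries.

    Hypothetical helper: these two distances are what an anchor-free
    module would regress for a point lying inside an action instance.
    """
    start, end = segment
    if not start <= t <= end:
        raise ValueError("point must lie inside the action segment")
    return t - start, end - t


def decode_points(
    points: Sequence[float],
    d_start: Sequence[float],
    d_end: Sequence[float],
    video_length: float,
) -> List[Segment]:
    """Turn (point, distance-to-start, distance-to-end) triples into segments.

    Clipping to [0, video_length] is an assumption of this sketch.
    """
    segments = []
    for t, ds, de in zip(points, d_start, d_end):
        segments.append((max(0.0, t - ds), min(video_length, t + de)))
    return segments


if __name__ == "__main__":
    # A very short and a very long action, decoded by the same rule.
    print(decode_points([5.0, 50.0], [0.5, 40.0], [0.5, 45.0], 90.0))
```

Because the distances are unconstrained regression targets rather than offsets from a fixed set of anchor lengths, the same decoding rule covers both a one-second action and one spanning most of the video.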
