Improving Weakly Supervised Temporal Action Localization by Exploiting Multi-Resolution Information in Temporal Domain.

Author information

Su Rui, Xu Dong, Zhou Luping, Ouyang Wanli

Publication information

IEEE Trans Image Process. 2021;30:6659-6672. doi: 10.1109/TIP.2021.3089355. Epub 2021 Jul 26.

DOI:10.1109/TIP.2021.3089355

Abstract

Weakly supervised temporal action localization is a challenging task because only video-level annotations are available during training. To address this problem, we propose a two-stage approach that generates high-quality frame-level pseudo labels by fully exploiting multi-resolution information in the temporal domain and complementary information between the appearance (i.e., RGB) and motion (i.e., optical flow) streams. In the first stage, we propose an Initial Label Generation (ILG) module to generate reliable initial frame-level pseudo labels. Specifically, in this newly proposed module, we exploit temporal multi-resolution consistency and cross-stream consistency to generate high-quality class activation sequences (CASs), a set of sequences in which each sequence measures how likely each video frame is to belong to one specific action class. In the second stage, we propose a Progressive Temporal Label Refinement (PTLR) framework to iteratively refine the pseudo labels, in which we use a set of selected frames with highly confident pseudo labels to progressively train two networks and better predict action class scores at each frame. Specifically, in our newly proposed PTLR framework, two networks called Network-OTS and Network-RTS, which generate CASs for the original temporal scale and the reduced temporal scales respectively, are used as two streams (i.e., the OTS stream and the RTS stream) to refine the pseudo labels in turn. In this way, multi-resolution information in the temporal domain is exchanged at the pseudo-label level, and each network/stream can be improved by exploiting the refined pseudo labels from the other network/stream. Comprehensive experiments on two benchmark datasets, THUMOS14 and ActivityNet v1.3, demonstrate the effectiveness of our newly proposed method for weakly supervised temporal action localization.
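The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of combining class activation sequences from two temporal scales and keeping highly confident frames as pseudo labels. Everything in it is an assumption made for illustration: the function names, the averaging used for fusion, the thresholds `hi` and `lo`, and the use of average pooling as a stand-in for a network that operates at a reduced temporal scale. It is not the authors' ILG or PTLR implementation.

```python
from typing import List, Optional

CAS = List[List[float]]  # T frames x C classes, scores in [0, 1]


def downsample(cas: CAS, factor: int) -> CAS:
    """Average-pool a score sequence along time (stand-in for a reduced scale)."""
    pooled = []
    for start in range(0, len(cas), factor):
        window = cas[start:start + factor]
        pooled.append([sum(col) / len(window) for col in zip(*window)])
    return pooled


def upsample(cas: CAS, factor: int, length: int) -> CAS:
    """Repeat each row `factor` times and trim to `length` frames."""
    repeated = [list(row) for row in cas for _ in range(factor)]
    return repeated[:length]


def fuse(cas_a: CAS, cas_b: CAS) -> CAS:
    """Average two equally long score sequences (hypothetical fusion rule)."""
    return [[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(cas_a, cas_b)]


def select_confident(cas: CAS, video_classes: List[int],
                     hi: float = 0.75, lo: float = 0.25) -> List[Optional[int]]:
    """Per-frame pseudo label: class index, -1 for background, None if uncertain.

    Only classes present in the video-level annotation are considered.
    """
    labels: List[Optional[int]] = []
    for row in cas:
        best = max(video_classes, key=lambda c: row[c])
        if row[best] >= hi:
            labels.append(best)      # confident action frame
        elif row[best] <= lo:
            labels.append(-1)        # confident background frame
        else:
            labels.append(None)      # left unlabeled
    return labels


# Toy example: 4 frames, 2 classes, video-level label is class 0.
cas_ots = [[1.0, 0.0], [0.5, 0.0], [0.0, 0.25], [0.0, 0.0]]
cas_rts = downsample(cas_ots, factor=2)               # pretend reduced-scale output
fused = fuse(cas_ots, upsample(cas_rts, 2, len(cas_ots)))
pseudo_labels = select_confident(fused, video_classes=[0])
print(pseudo_labels)  # [0, None, -1, -1]
```

In this toy setting the second frame stays unlabeled because its fused score falls between the two thresholds; a refinement loop in the spirit of the abstract would retrain on the confident frames and then re-score the uncertain ones.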

Similar articles

Multi-Modality Self-Distillation for Weakly Supervised Temporal Action Localization.

IEEE Trans Image Process. 2022;31:1504-1519. doi: 10.1109/TIP.2021.3137649. Epub 2022 Jan 28.

Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization.

IEEE Trans Pattern Anal Mach Intell. 2021 Dec;43(12):4477-4490. doi: 10.1109/TPAMI.2020.2997860. Epub 2021 Nov 3.

Adaptive Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization.

IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):4136-4151. doi: 10.1109/TPAMI.2022.3189662. Epub 2023 Mar 7.

Uncertainty Guided Collaborative Training for Weakly Supervised and Unsupervised Temporal Action Localization.

IEEE Trans Pattern Anal Mach Intell. 2023 Apr;45(4):5252-5267. doi: 10.1109/TPAMI.2022.3200399. Epub 2023 Mar 7.

Neighbor-Guided Pseudo-Label Generation and Refinement for Single-Frame Supervised Temporal Action Localization.

IEEE Trans Image Process. 2024;33:2419-2430. doi: 10.1109/TIP.2024.3378477. Epub 2024 Mar 29.

PCG-TAL: Progressive Cross-Granularity Cooperation for Temporal Action Localization.

IEEE Trans Image Process. 2021;30:2103-2113. doi: 10.1109/TIP.2020.3044218. Epub 2021 Jan 25.

Weakly Supervised Temporal Action Localization Through Contrast Based Evaluation Networks.

IEEE Trans Pattern Anal Mach Intell. 2022 Sep;44(9):5886-5902. doi: 10.1109/TPAMI.2021.3078798. Epub 2022 Aug 4.

Weakly supervised salient object detection via image category annotation.

Math Biosci Eng. 2023 Dec 1;20(12):21359-21381. doi: 10.3934/mbe.2023945.

Compact Representation and Reliable Classification Learning for Point-Level Weakly-Supervised Action Localization.

IEEE Trans Image Process. 2022;31:7363-7377. doi: 10.1109/TIP.2022.3222623. Epub 2022 Nov 30.

PMID:34166188
