Contrastive Mask Learning for Self-Supervised 3D Skeleton-Based Action Recognition

Author

Zhang Haoyuan

Affiliation

School of Electrical and Information Engineering, North Minzu University, Yinchuan 750021, China.

Publication information

Sensors (Basel). 2025 Feb 28;25(5):1521. doi: 10.3390/s25051521.

DOI: 10.3390/s25051521

PMID: 40096387

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11902868/

Abstract

In this paper, we propose a contrastive mask learning (CML) method for self-supervised 3D skeleton-based action recognition. Specifically, the mask modeling mechanism is integrated into multi-level contrastive learning so that contrastive learning and masked skeleton reconstruction form a mutually beneficial learning scheme. The contrastive objective is extended from individual skeleton instances to clusters by closing the gap between the cluster assignments of different instances of the same category, with the goal of pursuing inter-instance consistency. Compared with previous methods, CML integrates contrastive and masked learning comprehensively and pursues both intra-instance and inter-instance consistency via multi-level contrast, which leads to more discriminative skeleton representations. Our extensive evaluation on the challenging NTU RGB+D and PKU-MMD benchmarks demonstrates that representations learned via CML exhibit superior discriminability, consistently outperforming state-of-the-art methods in terms of action recognition accuracy.
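To make the three ingredients named in the abstract concrete, the sketch below combines an instance-level contrastive term, a cluster-level consistency term, and a masked reconstruction term into one objective. It is an illustration only, not the paper's implementation: the abstract gives no formulas, so the InfoNCE form, the prototype-based soft cluster assignment, the use of two views of one instance as a stand-in for "instances of the same category", and every name and weight here (`info_nce`, `cluster_consistency`, `masked_reconstruction`, `w_cluster`, `w_recon`, `temperature`) are assumptions.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def info_nce(z_a, z_b, temperature=0.1):
    # Instance-level contrast (assumed InfoNCE form): row i of z_a is the
    # positive for row i of z_b; all other rows act as negatives.
    logits = l2_normalize(z_a) @ l2_normalize(z_b).T / temperature
    return -np.mean(np.diag(log_softmax(logits)))

def cluster_consistency(z_a, z_b, prototypes, temperature=0.1):
    # Cluster-level contrast (assumed form): symmetric cross-entropy between
    # the soft cluster assignments of the two embeddings.
    protos = l2_normalize(prototypes)
    log_p_a = log_softmax(l2_normalize(z_a) @ protos.T / temperature)
    log_p_b = log_softmax(l2_normalize(z_b) @ protos.T / temperature)
    p_a, p_b = np.exp(log_p_a), np.exp(log_p_b)
    return -0.5 * np.mean((p_b * log_p_a).sum(-1) + (p_a * log_p_b).sum(-1))

def masked_reconstruction(skeleton, reconstruction, mask):
    # Squared error summed over the 3 coordinates, averaged over masked joints.
    # skeleton, reconstruction: (N, T, J, 3); mask: (N, T, J), True = masked.
    sq_err = ((skeleton - reconstruction) ** 2).sum(-1)
    return (sq_err * mask).sum() / max(mask.sum(), 1)

def combined_loss(z_a, z_b, prototypes, skeleton, reconstruction, mask,
                  w_cluster=1.0, w_recon=1.0):
    # Hypothetical weighting; the abstract does not say how terms are balanced.
    return (info_nce(z_a, z_b)
            + w_cluster * cluster_consistency(z_a, z_b, prototypes)
            + w_recon * masked_reconstruction(skeleton, reconstruction, mask))

# Toy data standing in for encoder and decoder outputs (no real model here).
rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 16))                  # embeddings of view 1
z2 = z1 + 0.05 * rng.normal(size=(8, 16))      # embeddings of view 2
protos = rng.normal(size=(4, 16))              # 4 hypothetical cluster prototypes
skel = rng.normal(size=(8, 10, 25, 3))         # 8 clips, 10 frames, 25 joints
mask = rng.random(size=(8, 10, 25)) < 0.5      # mask about half of the joints
recon = skel + 0.1 * rng.normal(size=skel.shape)
total = combined_loss(z1, z2, protos, skel, recon, mask)
print(float(total))
```

In a real training loop the embeddings and the reconstruction would come from a shared encoder and a decoder, so that the gradients of all three terms shape the same representation.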

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8282/11902868/856de6333b87/sensors-25-01521-g001.jpg

Similar articles

Contrast-Reconstruction Representation Learning for Self-Supervised Skeleton-Based Action Recognition.

IEEE Trans Image Process. 2022;31:6224-6238. doi: 10.1109/TIP.2022.3207577. Epub 2022 Sep 28.

X-Invariant Contrastive Augmentation and Representation Learning for Semi-Supervised Skeleton-Based Action Recognition.

IEEE Trans Image Process. 2022;31:3852-3867. doi: 10.1109/TIP.2022.3175605. Epub 2022 Jun 2.

DMMG: Dual Min-Max Games for Self-Supervised Skeleton-Based Action Recognition.

IEEE Trans Image Process. 2024;33:395-407. doi: 10.1109/TIP.2023.3338410. Epub 2023 Dec 27.

Multi-Granularity Anchor-Contrastive Representation Learning for Semi-Supervised Skeleton-Based Action Recognition.

IEEE Trans Pattern Anal Mach Intell. 2023 Jun;45(6):7559-7576. doi: 10.1109/TPAMI.2022.3222871. Epub 2023 May 5.

Self-Supervised 3D Action Representation Learning With Skeleton Cloud Colorization.

IEEE Trans Pattern Anal Mach Intell. 2024 Jan;46(1):509-524. doi: 10.1109/TPAMI.2023.3325463. Epub 2023 Dec 5.

Momentum Contrastive Teacher for Semi-Supervised Skeleton Action Recognition.

IEEE Trans Image Process. 2025 Jan 1;PP. doi: 10.1109/TIP.2024.3522818.

Mutual Information Driven Equivariant Contrastive Learning for 3D Action Representation Learning.

IEEE Trans Image Process. 2024;33:1883-1897. doi: 10.1109/TIP.2024.3372451. Epub 2024 Mar 12.

Self-Supervised Action Representation Learning Based on Asymmetric Skeleton Data Augmentation.

Sensors (Basel). 2022 Nov 20;22(22):8989. doi: 10.3390/s22228989.

ConMLP: MLP-Based Self-Supervised Contrastive Learning for Skeleton Data Analysis and Action Recognition.

Sensors (Basel). 2023 Feb 22;23(5):2452. doi: 10.3390/s23052452.

References cited in this article

Contrastive Masked Autoencoders are Stronger Vision Learners.

IEEE Trans Pattern Anal Mach Intell. 2024 Apr;46(4):2506-2517. doi: 10.1109/TPAMI.2023.3336525. Epub 2024 Mar 6.

NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding.

IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2684-2701. doi: 10.1109/TPAMI.2019.2916873. Epub 2019 May 14.
