Achieving view-distance and -angle invariance in motion prediction using a simple network.

Authors

Zhao Haichuan, Ru Xudong, Du Peng, Liu Shaolong, Liu Na, Wang Xingce, Wu Zhongke

Affiliations

School of Artificial Intelligence, Beijing Normal University, Beijing, 100875, China.

School of Arts and Communication, Beijing Normal University, Beijing, 100875, China.

Publication information

Vis Comput Ind Biomed Art. 2024 Oct 28;7(1):26. doi: 10.1186/s42492-024-00176-5.

DOI:10.1186/s42492-024-00176-5

PMID:39466577

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11519277/

Abstract

Recently, human motion prediction has gained significant attention and achieved notable success. However, current methods are primarily trained and tested on ideal datasets, overlooking the variations in viewing distance and viewing angle that are commonly encountered in practical scenarios. In this study, we address the issue of model invariance, aiming for robust performance despite variations in viewing distance and angle. To achieve this, we employ Riemannian geometry methods to constrain the learning process of neural networks, enabling invariant prediction with a simple network and broadening the application of motion prediction to various scenarios. Our framework uses Riemannian geometry to encode motion into a novel motion space, so that a simple network can produce predictions that are invariant to viewing distance and angle. Specifically, the specified path transport square-root velocity function is proposed to help remove the view-angle equivalence class and encode motion sequences into a flattened space. Motion coding by the geometric method linearizes the optimization problem in a non-flattened space and effectively extracts motion information, allowing the proposed method to achieve competitive performance using a simple network. Experimental results on Human 3.6M and CMU MoCap demonstrate that the proposed framework has competitive performance and invariance to the viewing distance and viewing angle.
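The abstract builds on the square-root velocity function (SRVF). As a rough illustration only, the sketch below implements the classical SRVF, q(t) = v(t) / sqrt(|v(t)|), for a sampled 3D trajectory; it is not the authors' specified path transport variant. It assumes that a change in viewing distance can be modeled as uniform scaling and a change in viewing angle as a rotation, and all function names (`srvf`, `normalize`, `rotate_z`) are hypothetical helpers introduced for this example.

```python
import math

def srvf(curve, dt=1.0):
    """Classical square-root velocity function of a sampled curve.

    curve: list of 3D points. Returns one vector per sampling interval.
    """
    q = []
    for p0, p1 in zip(curve, curve[1:]):
        v = [(b - a) / dt for a, b in zip(p0, p1)]   # finite-difference velocity
        speed = math.sqrt(sum(c * c for c in v))
        # q = v / sqrt(|v|); set q = 0 where the curve is stationary
        q.append([c / math.sqrt(speed) if speed > 1e-12 else 0.0 for c in v])
    return q

def normalize(q, dt=1.0):
    """Rescale q to unit L2 norm, which removes uniform scaling of the curve."""
    norm = math.sqrt(sum(c * c for row in q for c in row) * dt)
    return [[c / norm for c in row] for row in q]

def rotate_z(vectors, angle):
    """Rotate a list of 3D vectors about the z-axis (stand-in for a view-angle change)."""
    ca, sa = math.cos(angle), math.sin(angle)
    return [[ca * x - sa * y, sa * x + ca * y, z] for x, y, z in vectors]

# Toy trajectory of a single joint: a helix sampled at 20 frames (made-up data).
curve = [[math.cos(0.3 * t), math.sin(0.3 * t), 0.1 * t] for t in range(20)]
scaled = [[2.5 * c for c in p] for p in curve]   # assumed model of a distance change
rotated = rotate_z(curve, 0.7)                   # assumed model of an angle change

q = normalize(srvf(curve))
q_scaled = normalize(srvf(scaled))
q_rotated = normalize(srvf(rotated))

def max_abs_diff(a, b):
    return max(abs(x - y) for r1, r2 in zip(a, b) for x, y in zip(r1, r2))

scale_gap = max_abs_diff(q, q_scaled)                  # scaling is removed by normalization
rotation_gap = max_abs_diff(rotate_z(q, 0.7), q_rotated)  # SRVF commutes with rotation
```

In this sketch the normalized SRVF is unchanged by scaling, whereas rotation is only carried through (the SRVF of the rotated curve is the rotated SRVF). Removing the rotation, that is, the view-angle equivalence class the abstract refers to, needs an additional step that is not attempted here.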

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f694/11519277/57c46a743866/42492_2024_176_Fig1_HTML.jpg

Similar articles

Achieving view-distance and -angle invariance in motion prediction using a simple network.

Vis Comput Ind Biomed Art. 2024 Oct 28;7(1):26. doi: 10.1186/s42492-024-00176-5.

Viewing-distance invariance of movement detection.

Exp Brain Res. 1992;91(1):135-50. doi: 10.1007/BF00230022.

QMEDNet: A quaternion-based multi-order differential encoder-decoder model for 3D human motion prediction.

Neural Netw. 2022 Oct;154:141-151. doi: 10.1016/j.neunet.2022.07.005. Epub 2022 Jul 14.

Multi-branch deep learning neural network prediction model for the development of angular biosensors based on sEMG.

Front Bioeng Biotechnol. 2024 Oct 11;12:1492232. doi: 10.3389/fbioe.2024.1492232. eCollection 2024.

AMHGCN: Adaptive multi-level hypergraph convolution network for human motion prediction.

Neural Netw. 2024 Apr;172:106153. doi: 10.1016/j.neunet.2024.106153. Epub 2024 Jan 29.

Human Motion Prediction via Dual-Attention and Multi-Granularity Temporal Convolutional Networks.

Sensors (Basel). 2023 Jun 16;23(12):5653. doi: 10.3390/s23125653.

Arbitrary View Action Recognition via Transfer Dictionary Learning on Synthetic Training Data.

IEEE Trans Image Process. 2018 May 15. doi: 10.1109/TIP.2018.2836323.

Multiscale Spatio-Temporal Graph Neural Networks for 3D Skeleton-Based Motion Prediction.

IEEE Trans Image Process. 2021;30:7760-7775. doi: 10.1109/TIP.2021.3108708. Epub 2021 Sep 14.

Lung tumor segmentation in 4D CT images using motion convolutional neural networks.

Med Phys. 2021 Nov;48(11):7141-7153. doi: 10.1002/mp.15204. Epub 2021 Sep 13.

Parallel multi-stage rectification networks for 3D skeleton-based motion prediction.

Sci Rep. 2024 Oct 30;14(1):26058. doi: 10.1038/s41598-024-75782-7.

References cited by this article

STTG-net: a Spatio-temporal network for human motion prediction based on transformer and graph convolution network.

Vis Comput Ind Biomed Art. 2022 Jul 29;5(1):19. doi: 10.1186/s42492-022-00112-5.

Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction.

IEEE Trans Pattern Anal Mach Intell. 2023 Jan;45(1):681-697. doi: 10.1109/TPAMI.2021.3139918. Epub 2022 Dec 5.

NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding.

IEEE Trans Pattern Anal Mach Intell. 2020 Oct;42(10):2684-2701. doi: 10.1109/TPAMI.2019.2916873. Epub 2019 May 14.

Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments.

IEEE Trans Pattern Anal Mach Intell. 2014 Jul;36(7):1325-39. doi: 10.1109/TPAMI.2013.248.

Shape Analysis of Elastic Curves in Euclidean Spaces.

IEEE Trans Pattern Anal Mach Intell. 2011 Jul;33(7):1415-28. doi: 10.1109/TPAMI.2010.184. Epub 2010 Oct 14.
