Zhong Jianqi, Ye Conghui, Cao Wenming, Wang Hao
Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen, 518060, China.
State Key Laboratory of Radio Frequency Heterogeneous Integration, Shenzhen, 518060, China.
Sci Rep. 2024 Oct 30;14(1):26058. doi: 10.1038/s41598-024-75782-7.
Recurrent Neural Networks (RNNs), widely used in human prediction tasks, have achieved promising performance in motion prediction, owing to their robust capacity for spatial-temporal sequence modeling. However, RNN-based methods suffer from error accumulation because they predict step by step. In this paper, we therefore propose a three-stage parallel prediction network whose three networks are each guided by a different objective when generating their outputs. In particular, we fuse the high-dimensional information from these three networks to generate the final output. We also design a fusion block based on a GRU and an attention mechanism to extract high-dimensional information more efficiently. Extensive experiments show that our approach outperforms most recent methods in both short-term and long-term motion prediction on Human 3.6M, CMU Mocap, and 3DPW.
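The abstract does not specify how the fusion block combines a GRU with attention, so the following is only an illustrative sketch of one plausible arrangement, not the authors' implementation. It assumes each of the three networks yields one feature vector per predicted frame, that attention weights over the three branches are computed from the GRU hidden state, and that the weighted mix is then fed to a textbook GRU cell. All names (`GRUCell`, `fuse`, `W_q`), the dimensions, and the random weights are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Textbook GRU cell: update gate z, reset gate r, candidate state n.

    Biases are omitted to keep the sketch short.
    """

    def __init__(self, input_dim, hidden_dim, rng):
        scale = 1.0 / math.sqrt(hidden_dim)
        self.hidden_dim = hidden_dim
        # Each gate has one weight matrix applied to the concatenation [x, h].
        self.W_z = rand_matrix(hidden_dim, input_dim + hidden_dim, rng, scale)
        self.W_r = rand_matrix(hidden_dim, input_dim + hidden_dim, rng, scale)
        self.W_n = rand_matrix(hidden_dim, input_dim + hidden_dim, rng, scale)

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.W_z, x + h)]
        r = [sigmoid(a) for a in matvec(self.W_r, x + h)]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a) for a in matvec(self.W_n, x + gated_h)]
        # New state is a convex combination of the candidate and the old state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def fuse(branch_feats, cell, W_q):
    """Fuse per-frame features from several branches (assumed layout).

    branch_feats[k][t] is the feature vector of branch k at frame t.
    Returns the fused hidden state per frame and the attention weights per frame.
    """
    num_frames = len(branch_feats[0])
    feat_dim = len(branch_feats[0][0])
    h = [0.0] * cell.hidden_dim
    fused, weights = [], []
    for t in range(num_frames):
        frame_feats = [branch[t] for branch in branch_feats]
        # Project the hidden state into feature space to act as the query.
        query = matvec(W_q, h)
        # Scaled dot-product score of the query against each branch feature.
        scores = [
            sum(q * f for q, f in zip(query, feat)) / math.sqrt(feat_dim)
            for feat in frame_feats
        ]
        w = softmax(scores)
        # Attention-weighted mix of the branch features for this frame.
        context = [
            sum(wk * feat[d] for wk, feat in zip(w, frame_feats))
            for d in range(feat_dim)
        ]
        h = cell.step(context, h)
        fused.append(h)
        weights.append(w)
    return fused, weights


# Toy usage: 3 branches, 10 frames, 8-dim features, 16-dim hidden state.
rng = random.Random(0)
K, T, D, H = 3, 10, 8, 16
branch_feats = [
    [[rng.gauss(0, 1) for _ in range(D)] for _ in range(T)] for _ in range(K)
]
cell = GRUCell(D, H, rng)
W_q = rand_matrix(D, H, rng, 1.0 / math.sqrt(H))
fused, weights = fuse(branch_feats, cell, W_q)
print(len(fused), len(fused[0]), len(weights[0]))
```

In this sketch the recurrence runs over branch features rather than over the model's own previous predictions, so it does not reintroduce the step-by-step feedback that the abstract identifies as the source of error accumulation. Whether the paper's fusion block is arranged this way is not stated in the abstract.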