IEEE Trans Vis Comput Graph. 2023 Aug;29(8):3519-3534. doi: 10.1109/TVCG.2022.3163676. Epub 2023 Jun 29.
Synthesizing human motion with a global structure, such as a choreography, is a challenging task. Existing methods tend to concentrate on smooth local pose transitions and neglect the global context or theme of the motion. In this work, we present a music-driven motion synthesis framework that generates long-term sequences of human motions that are synchronized with the input beats and jointly form a global structure that respects a specific dance genre. In addition, our framework enables the generation of diverse motions that are controlled by the content of the music, not only by the beat. Our music-driven dance synthesis framework is a hierarchical system that consists of three levels: pose, motif, and choreography. The pose level consists of an LSTM component that generates temporally coherent sequences of poses. The motif level uses a novel motion perceptual loss to guide sets of consecutive poses to form a movement that belongs to a specific distribution. The choreography level selects the order of the performed movements and drives the system to follow the global structure of a dance genre. Our results demonstrate that the framework generates natural and consistent movements across various dance types, offers control over the content of the synthesized motions, and respects the overall structure of the dance.
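To make the three-level hierarchy concrete, here is a minimal Python sketch of how a choreography level might order motifs and align them with beats. It is an illustration under stated assumptions, not the method of the paper: the abstract does not say how the choreography level selects the order of movements, so a first-order Markov chain over motif labels is assumed here; the motif labels, the transition table, the fixed number of beats per motif, and all function names are hypothetical. The LSTM pose generator and the motion perceptual loss are not implemented; the pose level is reduced to a placeholder that only lists the frame timestamps a pose generator would have to fill.

```python
import random
from dataclasses import dataclass


@dataclass
class MotifSegment:
    motif: str    # hypothetical motif label, e.g. "step" or "spin"
    start: float  # segment start in seconds (falls on a beat)
    end: float    # segment end in seconds (falls on a later beat)


def sample_choreography(transitions, start, length, rng):
    """Choreography level (assumed form): first-order Markov chain over motifs."""
    sequence = [start]
    while len(sequence) < length:
        options = transitions[sequence[-1]]
        labels = list(options)
        weights = [options[label] for label in labels]
        sequence.append(rng.choices(labels, weights=weights, k=1)[0])
    return sequence


def schedule_on_beats(motifs, beat_times, beats_per_motif):
    """Place each motif on a span of consecutive beats so boundaries land on beats."""
    segments = []
    for i, motif in enumerate(motifs):
        first = i * beats_per_motif
        last = first + beats_per_motif
        if last >= len(beat_times):
            break  # not enough beats left for a complete motif
        segments.append(MotifSegment(motif, beat_times[first], beat_times[last]))
    return segments


def frames_for_segment(segment, fps):
    """Pose level placeholder: timestamps of the frames a pose generator would fill."""
    n_frames = round((segment.end - segment.start) * fps)
    return [segment.start + k / fps for k in range(n_frames)]


# Hypothetical genre structure: a "spin" is always followed by a "step".
transitions = {
    "step": {"step": 0.2, "spin": 0.8},
    "spin": {"step": 1.0},
}
rng = random.Random(0)
motifs = sample_choreography(transitions, "step", 4, rng)
beats = [0.5 * i for i in range(9)]  # beat times for a 120 BPM track
segments = schedule_on_beats(motifs, beats, beats_per_motif=2)
```

In this toy setup, the transition table plays the role of the genre's global structure, and the beat schedule is what keeps motif boundaries synchronized with the music. A learned model would replace both the hand-written table and the placeholder pose level.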