Mao Makara, Va Hongly, Hong Min
Department of Software Convergence, Soonchunhyang University, Asan 31538, Republic of Korea.
Department of Computer Software Engineering, Soonchunhyang University, Asan 31538, Republic of Korea.
Sensors (Basel). 2024 Jan 15;24(2):549. doi: 10.3390/s24020549.
In virtual reality, augmented reality, and animation, the goal is to reproduce the movement of real-world deformable objects in the virtual world as faithfully as possible. To that end, this paper proposes a method that automatically extracts cloth stiffness values from video scenes, which are then applied as material properties in virtual cloth simulation. We propose the use of deep learning (DL) models to tackle this task. The Transformer model, combined with pre-trained architectures such as DenseNet121, ResNet50, VGG16, and VGG19, is a leading choice for video classification. Position-Based Dynamics (PBD) is a computational framework widely used in computer graphics and physics-based simulation of deformable entities, notably cloth. It offers an inherently stable and efficient way to replicate complex dynamic behaviors such as folding, stretching, and collision interactions. Our proposed model characterizes virtual cloth with softness-to-stiffness labels and accurately categorizes videos according to this labeling. The cloth-movement dataset used in this research is derived from a carefully designed, stiffness-oriented cloth simulation. Our experimental assessment covers a dataset of 3840 videos, forming a multi-label video classification benchmark. Our results show that the proposed model achieves an average accuracy of 99.50%, significantly outperforming alternative models such as RNN, GRU, LSTM, and a standalone Transformer.
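To illustrate the PBD framework the abstract refers to, below is a minimal sketch of position-based dynamics for a one-dimensional chain of particles joined by distance constraints, with a stiffness parameter in [0, 1] playing the role of the material property the paper estimates. The function name, parameters, and the 1D setting are illustrative assumptions, not the authors' implementation.

```python
# Minimal Position-Based Dynamics (PBD) sketch: a vertical chain of
# particles joined by distance constraints. Particle 0 is pinned;
# `stiffness` in [0, 1] scales how strongly constraints are enforced.
# All names and parameters here are illustrative, not from the paper.

def simulate_pbd_chain(positions, rest_len, stiffness, gravity=-9.8,
                       dt=1.0 / 60.0, iterations=10, steps=1):
    """Advance a pinned chain of particles by `steps` PBD time steps."""
    velocities = [0.0] * len(positions)
    for _ in range(steps):
        # 1. Predict positions from current velocities + external forces.
        pred = [p + v * dt + gravity * dt * dt
                for p, v in zip(positions, velocities)]
        pred[0] = positions[0]  # particle 0 stays pinned
        # 2. Iteratively project each distance constraint toward rest_len,
        #    scaled by the stiffness parameter.
        for _ in range(iterations):
            for i in range(len(pred) - 1):
                d = pred[i + 1] - pred[i]
                err = abs(d) - rest_len
                corr = stiffness * 0.5 * err * (1.0 if d > 0 else -1.0)
                if i > 0:                 # pinned endpoint never moves
                    pred[i] += corr
                pred[i + 1] -= corr
        # 3. Derive velocities from the position change, then commit.
        velocities = [(q - p) / dt for p, q in zip(positions, pred)]
        positions = pred
    return positions
```

Running the same scene with different stiffness values changes how much the links stretch under gravity; it is this kind of visible difference in motion that the proposed classifier learns to map back to a stiffness label.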