Department of Computer Engineering, Kasetsart University, Bangkok, Thailand.
PLoS One. 2024 Aug 22;19(8):e0308852. doi: 10.1371/journal.pone.0308852. eCollection 2024.
In this paper, we propose a method to reduce model architecture search time. We consider MobileNetV2 for 3D face recognition as a case study and introduce layer replication to enhance accuracy. For a given network, various layers can be replicated, and effective replication can yield better accuracy. Our proposed algorithm identifies the optimal layer replication configuration for the model. We consider two acceleration methods: distributed data-parallel training and concurrent model training. Our experiments demonstrate the effectiveness of the automatic model-finding process for layer replication, using both distributed data-parallel and concurrent training under different conditions. The accuracy of our model improved by up to 6% compared to previous work on 3D MobileNetV2, and by 8% compared to the vanilla MobileNetV2. Training models with distributed data-parallel training across four GPUs reduced model training time by up to 75% compared to traditional single-GPU training. Additionally, the automatic model-finding process with concurrent training was 1,932 minutes faster than the distributed training approach in finding an optimal solution.
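The abstract does not spell out the search algorithm, so the following is only a minimal sketch of the general idea of scoring several layer replication configurations concurrently and keeping the best one. Everything in it is an assumption made for illustration: the names `mock_train_and_score` and `search_replication`, the exhaustive enumeration of configurations, the use of a thread pool, and above all the scoring function, which is a made-up stand-in for training a network and measuring its validation accuracy.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def mock_train_and_score(config):
    """Stand-in for training one candidate; returns a made-up score.

    Hypothetical: a real run would build the network with block i
    repeated config[i] times, train it, and return validation accuracy.
    """
    target = (2, 1, 3)  # arbitrary "best" configuration for this toy
    return -sum((c - t) ** 2 for c, t in zip(config, target))


def search_replication(num_blocks, max_repeats, workers=4):
    """Score every replication configuration, up to `workers` at a time."""
    # Each candidate is a tuple of repeat counts, one entry per block.
    candidates = list(product(range(1, max_repeats + 1), repeat=num_blocks))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so scores line up with candidates.
        scores = list(pool.map(mock_train_and_score, candidates))
    best_score, best_config = max(zip(scores, candidates))
    return best_config, best_score


if __name__ == "__main__":
    print(search_replication(num_blocks=3, max_repeats=3))
```

In this toy setting the candidates are independent, which is what makes concurrent evaluation straightforward. With real training jobs, each worker would more plausibly be a separate process pinned to its own GPU rather than a thread.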