Department of Software & Communications Engineering, Hongik University, Sejong 30016, Republic of Korea.
Sensors (Basel). 2023 Apr 27;23(9):4347. doi: 10.3390/s23094347.
This paper investigates image detection based on distributed deep-learning techniques for intelligent traffic systems and self-driving cars. Neural networks deployed on edge devices (e.g., closed-circuit television (CCTV) cameras for road surveillance) and trained on small datasets may lose accuracy and precision, leading to the misjudgment of targets. To address this challenge, TensorFlow and PyTorch were used to implement various distributed model-parallel and data-parallel techniques. Although these techniques worked, communication constraints and speed issues were observed. A hybrid pipeline was therefore proposed that combines dataset and model distribution, using an all-reduce algorithm and NVLink to prevent miscommunication among gradients. The proposed approach was tested on both an edge cluster and a Google cluster environment, where it outperformed the other test settings, and the bounding-box detection quality met expectations with increased reliability. The performance metrics evaluated were total training time, images per second, cross-entropy loss, and total loss against the number of epochs; they revealed close competition between TensorFlow and PyTorch. The hybrid pipeline in the PyTorch environment outperformed the other test settings.
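The abstract names an all-reduce algorithm as the mechanism that keeps gradients in agreement across workers, but it does not say which variant was used. As an illustration only, the following is a minimal pure-Python simulation of one common variant, ring all-reduce, applied to gradient averaging. It is a sketch under stated assumptions, not the authors' implementation: the function name `ring_all_reduce_mean`, the in-memory "workers", and the example gradients are all hypothetical, and no TensorFlow, PyTorch, or NVLink calls are involved.

```python
from typing import List


def ring_all_reduce_mean(grads: List[List[float]]) -> List[List[float]]:
    """Simulate ring all-reduce so every worker ends with the mean gradient.

    grads[r] is the local gradient vector of worker r (hypothetical setup).
    All vectors must have the same length.
    """
    n = len(grads)
    size = len(grads[0])
    # Split every vector into n contiguous chunks (some may be empty).
    bounds = [(i * size // n, (i + 1) * size // n) for i in range(n)]
    bufs = [list(g) for g in grads]  # per-worker working buffers

    # Phase 1, reduce-scatter: in step s, worker r sends chunk (r - s) mod n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        snapshot = [list(b) for b in bufs]  # all sends in a step are concurrent
        for rank in range(n):
            lo, hi = bounds[(rank - step) % n]
            dst = (rank + 1) % n
            for k in range(lo, hi):
                bufs[dst][k] += snapshot[rank][k]
    # Worker r now holds the fully summed chunk (r + 1) mod n.

    # Phase 2, all-gather: in step s, worker r forwards chunk (r + 1 - s) mod n
    # to its right neighbour, which overwrites its stale copy.
    for step in range(n - 1):
        snapshot = [list(b) for b in bufs]
        for rank in range(n):
            lo, hi = bounds[(rank + 1 - step) % n]
            dst = (rank + 1) % n
            for k in range(lo, hi):
                bufs[dst][k] = snapshot[rank][k]

    # Turn the summed gradients into the average used for the weight update.
    return [[v / n for v in b] for b in bufs]


if __name__ == "__main__":
    # Three hypothetical workers, each with a gradient from its own data shard.
    local_grads = [
        [1.0, 2.0, 3.0, 4.0],
        [10.0, 20.0, 30.0, 40.0],
        [100.0, 200.0, 300.0, 400.0],
    ]
    for rank, g in enumerate(ring_all_reduce_mean(local_grads)):
        print(f"worker {rank}: {g}")
```

In a ring, each worker exchanges data only with its neighbours, and the amount each worker sends stays roughly constant as workers are added. This is one reason all-reduce is often paired with fast point-to-point interconnects in data-parallel training.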