Ferraro Stefano, Van de Maele Toon, Mazzaglia Pietro, Verbelen Tim, Dhoedt Bart
IDLab, Ghent University, 25843 Ghent, Belgium.
Sensors (Basel). 2022 Sep 28;22(19):7382. doi: 10.3390/s22197382.
The robotics field has been deeply influenced by the advent of deep learning. In recent years, this trend has been characterized by the adoption of large, pretrained models for robotic use cases, which are not compatible with the computational hardware available in robotic systems. Moreover, such large, computationally intensive models impede the low-latency execution required by many closed-loop control systems. In this work, we propose different strategies for improving the computational efficiency of the deep-learning models adopted in reinforcement-learning (RL) scenarios. As a use case, we consider an image-based RL method that exploits the synergy between pushing and grasping actions. As a first optimization step, we reduce the complexity of the model architecture by decreasing the number of layers and altering the architecture structure. Second, we downscale the input resolution to reduce the computational load. Finally, we perform weight quantization, comparing post-training quantization with quantization-aware training. We benchmark the improvement introduced by each optimization by running a standard testing routine. We show that the optimization strategies introduced can improve the computational efficiency by around 300 times, while also slightly improving the functional performance of the system. In addition, we demonstrate closed-loop control behaviour on a real-world robot, with all processing running on a Jetson Xavier NX edge device.
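To make the weight-quantization step more concrete, the following is a minimal sketch of post-training quantization in plain Python. The abstract does not state which quantization scheme, bit width, or toolchain was used, so everything here is an assumption for illustration: the symmetric per-tensor int8 scheme, the function names `quantize_weights` and `dequantize_weights`, and the toy weight values are all hypothetical.

```python
def quantize_weights(weights, num_bits=8):
    """Symmetric per-tensor quantization (an assumed scheme, not the paper's).

    Maps float weights to signed integers in [-qmax, qmax] and returns
    the integers together with the scale needed to map them back.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_weights(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]


# Toy weights standing in for one layer of a trained model.
weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_weights(weights)
restored = dequantize_weights(quantized, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

In this sketch the quantization is applied once, after training. Quantization-aware training differs in that the same rounding is simulated during training, so the model can adapt its weights to the reduced precision.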