Heuthe Veit-Lorenz, Panizon Emanuele, Gu Hongri, Bechinger Clemens
Department of Physics, University of Konstanz, Universitaetsstrasse 10, Konstanz, 78464, Germany.
Centre for the Advanced Study of Collective Behaviour, Universitaetsstrasse 10, Konstanz, 78464, Germany.
Sci Robot. 2024 Dec 18;9(97):eado5888. doi: 10.1126/scirobotics.ado5888.
Swarm robots offer fascinating opportunities to perform complex tasks beyond the capabilities of individual machines. Just as a swarm of ants collectively moves large objects, similar functions can emerge within a group of robots through individual strategies based on local sensing. However, realizing collective functions with individually controlled microrobots is particularly challenging because of their micrometer size, large number of degrees of freedom, strong thermal noise relative to the propulsion speed, and complex physical coupling between neighboring microrobots. Here, we implemented multiagent reinforcement learning (MARL) to generate a control strategy for up to 200 microrobots whose motions are individually controlled by laser spots. During the learning process, we used so-called counterfactual rewards that automatically assign credit to the individual microrobots, which allows fast and unbiased training. With the help of this efficient reward scheme, swarm microrobots learn to collectively transport a large cargo object to an arbitrary position and orientation, similar to ant swarms. We show that this flexible and versatile swarm robotic system is robust to variations in group size, the presence of malfunctioning units, and environmental noise. In addition, we let the robot swarms manipulate multiple objects simultaneously in a demonstration experiment, highlighting the benefits of distributed control and independent microrobot motion. Control strategies such as ours can potentially enable complex and automated assembly of mobile micromachines, programmable drug delivery capsules, and other advanced lab-on-a-chip applications.
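The counterfactual rewards mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' implementation: it assumes a common "difference reward" form, in which each microrobot is credited with the change in a global performance measure when its own contribution is hypothetically removed. The function names (`progress`, `counterfactual_rewards`), the additive push model, and the numbers are all hypothetical and chosen only for illustration.

```python
import numpy as np

def progress(pushes, cargo, target):
    """Toy global measure: reduction in cargo-to-target distance
    after the cargo is displaced by the sum of all robot pushes.
    (Assumption: pushes add linearly; real microrobot physics does not.)"""
    new_cargo = cargo + pushes.sum(axis=0)
    return np.linalg.norm(target - cargo) - np.linalg.norm(target - new_cargo)

def counterfactual_rewards(pushes, cargo, target):
    """Credit robot i with: progress(all robots) - progress(all robots except i)."""
    total = progress(pushes, cargo, target)
    rewards = np.empty(len(pushes))
    for i in range(len(pushes)):
        without_i = np.delete(pushes, i, axis=0)  # counterfactual: robot i absent
        rewards[i] = total - progress(without_i, cargo, target)
    return rewards

cargo = np.array([0.0, 0.0])
target = np.array([10.0, 0.0])
# Robot 0 pushes toward the target, robot 1 is idle, robot 2 pushes the wrong way.
pushes = np.array([[1.0, 0.0], [0.0, 0.0], [-0.5, 0.0]])
rewards = counterfactual_rewards(pushes, cargo, target)
print(rewards)
```

In this toy setting the helpful robot receives a positive reward, the idle robot receives zero, and the robot pushing away from the target receives a negative reward, which is the kind of automatic credit assignment the abstract describes.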