Xiao Jiaping, Pisutsin Phumrapee, Feroskhan Mir
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):313-327. doi: 10.1109/TNNLS.2023.3331370. Epub 2025 Jan 7.
Equipping drones with target search capabilities is highly desirable for applications in disaster rescue and smart warehouse delivery systems. Multiple intelligent drones that collaborate with each other and maneuver among obstacles can accomplish such tasks more effectively and in less time. However, carrying out collaborative target search (CTS) without prior target information is extremely challenging, especially with a visual drone swarm. In this work, we propose a novel data-efficient deep reinforcement learning (DRL) approach called adaptive curriculum embedded multistage learning (ACEMSL) to address these challenges, chiefly the exploration of a 3-D sparse reward space under limited visual perception and the need for collaborative behavior. Specifically, we decompose the CTS task into several subtasks, including individual obstacle avoidance, target search, and inter-agent collaboration, and progressively train the agents with multistage learning. Meanwhile, an adaptive embedded curriculum (AEC) is designed, in which the task difficulty level (TDL) is adaptively adjusted based on the success rate (SR) achieved in training. ACEMSL enables data-efficient training and individual-team reward allocation for the visual drone swarm. Furthermore, we deploy the trained model on a real visual drone swarm and perform CTS operations without fine-tuning. Extensive simulations and real-world flight tests validate the effectiveness and generalizability of ACEMSL. The project is available at https://github.com/NTU-UAVG/CTS-visual-drone-swarm.git.
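The abstract states only that the TDL is adjusted from the SR achieved in training. It does not give the update rule. As a rough illustration of that idea, the following is a minimal sketch, not the authors' implementation. It assumes a sliding window of episode outcomes and fixed promotion and demotion thresholds. The class name `AdaptiveCurriculum` and the parameters `max_level`, `window`, `raise_at`, and `lower_at` are all hypothetical.

```python
from collections import deque


class AdaptiveCurriculum:
    """Hypothetical success-rate-driven difficulty scheduler.

    Illustrative only: the window size and thresholds are assumptions,
    not values taken from the ACEMSL paper.
    """

    def __init__(self, max_level=5, window=20, raise_at=0.8, lower_at=0.3):
        self.level = 0                        # current task difficulty level (TDL)
        self.max_level = max_level
        self.raise_at = raise_at              # SR at or above this promotes
        self.lower_at = lower_at              # SR at or below this demotes
        self.outcomes = deque(maxlen=window)  # recent episode results

    def success_rate(self):
        """Success rate (SR) over the current window; 0.0 if empty."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def record(self, success):
        """Store one episode outcome and return the (possibly updated) TDL."""
        self.outcomes.append(bool(success))
        # Wait for a full window so the SR estimate is not too noisy.
        if len(self.outcomes) < self.outcomes.maxlen:
            return self.level
        sr = self.success_rate()
        if sr >= self.raise_at and self.level < self.max_level:
            self.level += 1
            self.outcomes.clear()  # restart SR measurement at the new level
        elif sr <= self.lower_at and self.level > 0:
            self.level -= 1
            self.outcomes.clear()
        return self.level
```

In a training loop, `record` would be called once per episode and the returned level used to configure the next episode, for example the number of obstacles or the distance to the target. Those environment parameters are likewise assumptions here.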