Pagnozzi Federico, Birattari Mauro
IRIDIA, Université libre de Bruxelles, Brussels, Belgium.
Front Robot AI. 2021 Apr 29;8:625125. doi: 10.3389/frobt.2021.625125. eCollection 2021.
Due to the decentralized, loosely coupled nature of a swarm and to the lack of a general design methodology, the development of control software for robot swarms is typically an iterative process. Control software is generally modified and refined repeatedly, either manually or automatically, until satisfactory results are obtained. In this paper, we propose a technique based on off-policy evaluation to estimate how the performance of an instance of control software, implemented as a probabilistic finite-state machine, would be impacted by modifying its structure and the values of its parameters. The proposed technique is particularly appealing when coupled with automatic design methods belonging to the AutoMoDe family, as it can exploit the data generated during the design process. The technique can be used either to reduce the complexity of the generated control software, thereby improving its readability, or to evaluate perturbations of the parameters, which could help in prioritizing the exploration of the neighborhood of the current solution within an iterative improvement algorithm. To evaluate the technique, we apply it to control software generated with an AutoMoDe method. In a first experiment, we use the proposed technique to estimate the impact of removing a state from a probabilistic finite-state machine. In a second experiment, we use it to predict the impact of changing the values of the parameters. The results show that the technique is promising and significantly better than a naive estimation. We discuss the limitations of the current implementation of the technique, and we sketch possible improvements, extensions, and generalizations.
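To make the idea of off-policy evaluation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that importance sampling is used as the estimator, which the abstract does not state. The two-state machine, its transition probabilities, the toy score, and all function names are hypothetical and serve only as illustration. Episodes are recorded while running one probabilistic finite-state machine (the behavior machine), and their scores are reweighted to estimate the performance of a modified machine (the target machine) without running it.

```python
import random
from math import prod

# Hypothetical toy machines: P[state][next_state] is a transition probability.
# BEHAVIOR generated the data; TARGET is the modified machine to be assessed.
BEHAVIOR = {"explore": {"explore": 0.5, "stop": 0.5},
            "stop": {"explore": 0.5, "stop": 0.5}}
TARGET = {"explore": {"explore": 0.8, "stop": 0.2},
          "stop": {"explore": 0.4, "stop": 0.6}}


def run_episode(pfsm, steps, rng, start="explore"):
    """Sample one trajectory; the toy score counts steps that end in 'explore'."""
    state, transitions, score = start, [], 0.0
    for _ in range(steps):
        options = list(pfsm[state])
        weights = [pfsm[state][o] for o in options]
        nxt = rng.choices(options, weights=weights)[0]
        transitions.append((state, nxt))
        score += 1.0 if nxt == "explore" else 0.0
        state = nxt
    return transitions, score


def importance_weight(transitions, behavior, target):
    """Ratio of the trajectory's probability under target versus behavior."""
    return prod(target[s][n] / behavior[s][n] for s, n in transitions)


def off_policy_estimate(episodes, behavior, target, weighted=True):
    """Estimate the target's mean score from episodes run with behavior.

    weighted=True normalizes by the sum of weights (weighted importance
    sampling); weighted=False divides by the number of episodes.
    """
    weights = [importance_weight(t, behavior, target) for t, _ in episodes]
    total = sum(w * g for w, (_, g) in zip(weights, episodes))
    denom = sum(weights) if weighted else len(episodes)
    return total / denom if denom else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    data = [run_episode(BEHAVIOR, 5, rng) for _ in range(5000)]
    print("estimated target score:", off_policy_estimate(data, BEHAVIOR, TARGET))
```

In this sketch, every transition that the target machine can take must have a nonzero probability under the behavior machine, otherwise the weight is undefined. Setting a target probability to zero gives zero weight to every recorded trajectory that uses that transition, which is one simple way to picture a structural modification; how the paper handles state removal in detail is not specified in the abstract.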