Berkenkamp Felix, Krause Andreas, Schoellig Angela P
Department of Computer Science, ETH Zurich, Zurich, Switzerland.
Institute for Aerospace Studies, University of Toronto, Toronto, Canada.
Mach Learn. 2023;112(10):3713-3747. doi: 10.1007/s10994-021-06019-1. Epub 2021 Jun 24.
Selecting the right tuning parameters for algorithms is a prevalent problem in machine learning that can significantly affect the performance of algorithms. Data-efficient optimization algorithms, such as Bayesian optimization, have been used to automate this process. During experiments on real-world systems such as robotic platforms, however, these methods can evaluate unsafe parameters that lead to safety-critical system failures and can destroy the system. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in practice, since they are often opposing objectives. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy all safety constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions expressed as a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate in experiments on a quadrotor vehicle that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters.
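The core idea of evaluating only parameters that satisfy every constraint with high probability can be illustrated with a minimal sketch. This is not the algorithm from the paper: it is a simplified "safe upper-confidence-bound" rule on a 1-D grid, and it omits the paper's dedicated safe-set expansion step, the context variables, and the theoretically derived confidence scaling. The function names, the RBF kernel, the lengthscale, and the fixed `beta=2.0` are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed prior)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    # Variance reduction contributed by the observations at each query point.
    reduction = np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    var = np.diag(rbf_kernel(x_query, x_query)) - reduction
    return mean, np.sqrt(np.clip(var, 0.0, None))


def safe_set(x_train, constraint_obs, x_grid, thresholds, beta=2.0):
    """Mask of grid points whose lower confidence bound clears every threshold."""
    mask = np.ones(len(x_grid), dtype=bool)
    for g_obs, threshold in zip(constraint_obs, thresholds):
        mean, std = gp_posterior(x_train, g_obs, x_grid)
        mask &= (mean - beta * std) >= threshold
    return mask


def next_safe_point(x_train, f_obs, constraint_obs, x_grid, thresholds, beta=2.0):
    """Pick the safe grid point with the highest upper bound on performance."""
    mask = safe_set(x_train, constraint_obs, x_grid, thresholds, beta)
    if not mask.any():
        raise ValueError("no grid point is certified safe")
    mean, std = gp_posterior(x_train, f_obs, x_grid)
    ucb = np.where(mask, mean + beta * std, -np.inf)  # unsafe points never win
    return x_grid[np.argmax(ucb)], mask


# Toy usage: one initial safe parameter with one performance and one safety reading.
x_grid = np.linspace(0.0, 1.0, 101)
x_next, mask = next_safe_point(
    x_train=np.array([0.3]),
    f_obs=np.array([0.5]),
    constraint_obs=[np.array([1.0])],  # a single constraint, safe if >= 0
    x_grid=x_grid,
    thresholds=[0.0],
)
print(f"next evaluation at x = {x_next:.2f}; {mask.sum()} grid points certified safe")
```

Because the zero-mean prior leaves the lower bound well below the threshold far from any data, only a small neighbourhood of the initial safe parameter is certified at first; in this sketch the certified region grows only as new safe observations are added.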