

Certifiable Robustness to Adversarial State Uncertainty in Deep Reinforcement Learning.

Author information

Everett Michael, Lutjens Bjorn, How Jonathan P

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 Sep;33(9):4184-4198. doi: 10.1109/TNNLS.2021.3056046. Epub 2022 Aug 31.

Abstract

Deep neural network-based systems are now state-of-the-art in many robotics tasks, but their application in safety-critical domains remains dangerous without formal guarantees on network robustness. Small perturbations to sensor inputs (from noise or adversarial examples) are often enough to change network-based decisions, which was recently shown to cause an autonomous vehicle to swerve into another lane. In light of these dangers, numerous algorithms have been developed as defensive mechanisms against these adversarial inputs, some of which provide formal robustness guarantees or certificates. This work leverages research on certified adversarial robustness to develop an online certifiably robust defense for deep reinforcement learning algorithms. The proposed defense computes guaranteed lower bounds on state-action values during execution to identify and choose a robust action under a worst-case deviation in input space due to possible adversaries or noise. Moreover, the resulting policy comes with a certificate of solution quality, even though the true state and optimal action are unknown to the certifier due to the perturbations. The approach is demonstrated on a deep Q-network (DQN) policy and is shown to increase robustness to noise and adversaries in pedestrian collision avoidance scenarios, a classic control task, and Atari Pong. This article extends our prior work with new performance guarantees, extensions to other reinforcement learning algorithms, expanded results aggregated across more scenarios, an extension into scenarios with adversarial behavior, comparisons with a more computationally expensive method, and visualizations that provide intuition about the robustness algorithm.
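The core loop the abstract describes — bound each state-action value under any perturbation of the observed state, then act on the certified lower bounds — can be sketched with interval bound propagation (a simpler bounding scheme than the tighter CROWN/Fast-Lin-style bounds this line of work builds on). The two-layer network, its weights, and the `eps` radius below are illustrative assumptions, not the paper's actual DQN:

```python
import numpy as np

def ibp_bounds(W1, b1, W2, b2, s, eps):
    """Interval bound propagation through a 2-layer ReLU Q-network.

    Returns elementwise lower/upper bounds on Q(s_true, a) for every
    action a, valid for any true state within an L-infinity ball of
    radius eps around the observed state s."""
    lo, hi = s - eps, s + eps
    # Affine layer: split weights by sign so each interval endpoint
    # is propagated through the worst-case input endpoint.
    W1p, W1n = np.maximum(W1, 0), np.minimum(W1, 0)
    lo1 = W1p @ lo + W1n @ hi + b1
    hi1 = W1p @ hi + W1n @ lo + b1
    # ReLU is monotone, so the bounds pass through unchanged in order.
    lo1, hi1 = np.maximum(lo1, 0), np.maximum(hi1, 0)
    W2p, W2n = np.maximum(W2, 0), np.minimum(W2, 0)
    q_lo = W2p @ lo1 + W2n @ hi1 + b2
    q_hi = W2p @ hi1 + W2n @ lo1 + b2
    return q_lo, q_hi

def robust_action(W1, b1, W2, b2, s_obs, eps):
    """Choose the action whose certified worst-case Q-value is highest."""
    q_lo, _ = ibp_bounds(W1, b1, W2, b2, s_obs, eps)
    return int(np.argmax(q_lo))
```

The certified lower bound `q_lo` on the chosen action also serves as the certificate of solution quality the abstract mentions: no perturbation within the `eps`-ball can drive that action's true value below it.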

