

Learning explainable task-relevant state representation for model-free deep reinforcement learning.

Affiliations

College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, China; RIKEN Center for Advanced Intelligence Project (AIP), Tokyo, Japan.

College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, China.

Publication information

Neural Netw. 2024 Dec;180:106741. doi: 10.1016/j.neunet.2024.106741. Epub 2024 Sep 20.

Abstract

State representations considerably accelerate learning and improve data efficiency in deep reinforcement learning (DRL), especially for visual tasks. Task-relevant state representations focus on features relevant to the task and filter out irrelevant elements, further improving performance. However, task-relevant representations are typically obtained through model-based DRL methods, which require learning a transition function, itself a challenging task. Moreover, inaccuracies in the learned transition function can degrade performance and negatively affect policy learning. To address this issue, we propose explainable task-relevant state representation (ETrSR), a method for model-free DRL that is direct, robust, and requires no transition model. Specifically, ETrSR first disentangles features from the states using a beta variational autoencoder (β-VAE). A reward prediction model then bootstraps these features to be relevant to the task, and explainable states are obtained by decoding the task-related features. Finally, we validate the proposed method on the CarRacing environment and various tasks in the DeepMind Control Suite (DMC). The results demonstrate the explainability of the method, which aids understanding of the decision-making process, and its outstanding performance even in environments with strong distractions.
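As a rough illustration of the pipeline described in the abstract, the sketch below shows a β-VAE-style encoder whose latent code is split into a task-relevant slice (supervised by a reward-prediction head) and a residual slice, with "explainable states" obtained by decoding only the task-relevant latents. This is a minimal sketch under stated assumptions: the module names, layer sizes, 64x64 input resolution, fixed latent split, and loss weighting are all illustrative and not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ETrSRSketch(nn.Module):
    """Hypothetical sketch: beta-VAE encoder, a reward head on a task-relevant
    latent slice, and a decoder used to visualize explainable states."""

    def __init__(self, latent_dim=32, task_dim=16, beta=4.0):
        super().__init__()
        self.beta = beta
        self.task_dim = task_dim  # first `task_dim` latents treated as task-relevant (assumption)
        # Encoder for 3x64x64 observations (resolution is an assumption).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),   # -> 32x32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # -> 64x16x16
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(64 * 16 * 16, latent_dim)
        self.fc_logvar = nn.Linear(64 * 16 * 16, latent_dim)
        self.decoder_fc = nn.Linear(latent_dim, 64 * 16 * 16)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # -> 32x32x32
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(), # -> 3x64x64
        )
        # Reward is predicted from the task-relevant slice only; the paper may
        # also condition on the action.
        self.reward_head = nn.Sequential(nn.Linear(task_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, obs):
        h = self.encoder(obs)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        recon = self.decoder(self.decoder_fc(z).view(-1, 64, 16, 16))
        reward_hat = self.reward_head(z[:, : self.task_dim])
        return recon, reward_hat, mu, logvar

    def loss(self, obs, reward, recon, reward_hat, mu, logvar):
        recon_loss = F.mse_loss(recon, obs)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # beta-VAE KL term
        reward_loss = F.mse_loss(reward_hat.squeeze(-1), reward)       # task-relevance signal
        return recon_loss + self.beta * kl + reward_loss

    @torch.no_grad()
    def explain(self, obs):
        # "Explainable state": decode only the task-relevant latents
        # (task-irrelevant dimensions zeroed) to visualize, in image space,
        # what the reward-predictive features capture.
        _, _, mu, _ = self.forward(obs)
        z_task = torch.zeros_like(mu)
        z_task[:, : self.task_dim] = mu[:, : self.task_dim]
        return self.decoder(self.decoder_fc(z_task).view(-1, 64, 16, 16))


model = ETrSRSketch()
obs = torch.rand(8, 3, 64, 64)   # dummy batch of image observations
reward = torch.rand(8)           # dummy rewards
recon, reward_hat, mu, logvar = model(obs)
model.loss(obs, reward, recon, reward_hat, mu, logvar).backward()
explainable_state = model.explain(obs)  # shape (8, 3, 64, 64)
```

In this sketch the representation is trained without any transition model: the only learning signals are reconstruction, the β-weighted KL term that encourages disentanglement, and reward prediction, which is what ties part of the latent code to the task.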

