On the convergence of projective-simulation-based reinforcement learning in Markov decision processes.

Authors

Boyajian W L, Clausen J, Trenkwalder L M, Dunjko V, Briegel H J

Affiliations

Institute for Theoretical Physics, University of Innsbruck, 6020 Innsbruck, Austria.

LIACS, Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands.

Publication information

Quantum Mach Intell. 2020;2(2):13. doi: 10.1007/s42484-020-00023-9. Epub 2020 Nov 5.

Abstract

In recent years, interest in leveraging quantum effects to enhance machine learning tasks has increased significantly. Many algorithms that speed up supervised and unsupervised learning have been established. Projective simulation was the first framework in which ways to exploit quantum resources were found specifically for the broader context of reinforcement learning. It is an agent-based reinforcement learning approach designed in a manner that may support quantum walk-based speedups. Although classical variants of projective simulation have been benchmarked against common reinforcement learning algorithms, very few formal theoretical analyses of its performance in standard learning scenarios have been provided. In this paper, we provide a detailed formal discussion of the properties of this model. Specifically, we prove that one version of the projective simulation model, understood as a reinforcement learning approach, converges to optimal behavior in a large class of Markov decision processes. This proof shows that a physically inspired approach to reinforcement learning can be guaranteed to converge.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f4e7/7644479/ba6c6002733e/42484_2020_23_Fig1_HTML.jpg
