Photonic reinforcement learning based on optoelectronic reservoir computing.

Author information

Kanno Kazutaka, Uchida Atsushi

Affiliation

Department of Information and Computer Sciences, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama City, Saitama, 338-8570, Japan.

Publication information

Sci Rep. 2022 Mar 8;12(1):3720. doi: 10.1038/s41598-022-07404-z.

DOI:10.1038/s41598-022-07404-z

PMID:35260595

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8904492/

Abstract

Reinforcement learning has been intensively investigated and developed in artificial intelligence for settings where training data are unavailable, with applications such as autonomous driving, robot control, internet advertising, and elastic optical networks. However, the computational cost of reinforcement learning with deep neural networks is extremely high, and reducing the learning cost is a challenging issue. We propose a photonic online implementation of reinforcement learning using optoelectronic delay-based reservoir computing, and we investigate it both experimentally and numerically. In the proposed scheme, reinforcement learning is accelerated to a rate of several megahertz because the internal connection weights of the reservoir require no learning process. We perform two benchmark tasks, CartPole-v0 and MountainCar-v0, to evaluate the proposed scheme. Our results represent the first hardware implementation of reinforcement learning based on photonic reservoir computing and pave the way for fast and efficient reinforcement learning with a novel photonic accelerator.

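The abstract's central point, that the internal connection weights of the reservoir need no training, can be illustrated with a minimal numerical sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `sin^2` node nonlinearity standing in for the optoelectronic delay loop, the Q-learning-style update of the readout, the helper names (`make_mask`, `reservoir_states`, `q_values`, `q_update`), and all parameter values. The sizes of 4 observations and 2 actions are chosen to resemble a CartPole-like task.

```python
import numpy as np

# Hypothetical sizes: 50 virtual nodes, 4 observations, 2 actions.
N_NODES, N_OBS, N_ACTIONS = 50, 4, 2


def make_mask(rng):
    """Fixed random input mask; drawn once and never trained."""
    return rng.uniform(-1.0, 1.0, size=(N_NODES, N_OBS + 1))


def reservoir_states(mask, obs, prev, feedback=0.9, in_scale=0.5, bias=np.pi / 4):
    """Map an observation to node states through a sin^2 nonlinearity.

    Simplified discrete-time stand-in for a delay loop (an assumption):
    each virtual node sees its own value from one delay period earlier.
    """
    u = mask @ np.append(obs, 1.0)  # masked input, with a constant bias input
    return np.sin(feedback * prev + in_scale * u + bias) ** 2


def q_values(w_out, x):
    """Linear readout: one estimated action value per action."""
    return w_out @ np.append(x, 1.0)


def q_update(w_out, x, action, reward, x_next, done, lr=0.01, gamma=0.99):
    """One Q-learning-style step applied to the readout weights only."""
    target = reward if done else reward + gamma * np.max(q_values(w_out, x_next))
    td_error = target - q_values(w_out, x)[action]
    new_w = w_out.copy()
    new_w[action] += lr * td_error * np.append(x, 1.0)
    return new_w, td_error


rng = np.random.default_rng(0)
mask = make_mask(rng)                        # stays fixed for the whole run
w_out = np.zeros((N_ACTIONS, N_NODES + 1))   # the only trained parameters
obs, next_obs = np.zeros(N_OBS), np.full(N_OBS, 0.1)  # placeholder observations
x = reservoir_states(mask, obs, np.zeros(N_NODES))
x_next = reservoir_states(mask, next_obs, x)
w_out, td_error = q_update(w_out, x, action=0, reward=1.0, x_next=x_next, done=False)
```

In this sketch only `w_out` changes during learning, while `mask` and the node dynamics stay fixed. This is the property that would let a physical reservoir run at its native speed, with the comparatively cheap linear readout trained separately.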