Adaptive model selection in photonic reservoir computing by reinforcement learning.

Author information

Kanno Kazutaka, Naruse Makoto, Uchida Atsushi

Affiliations

Department of Information and Computer Sciences, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama City, Saitama, 338-8570, Japan.

Department of Information Physics and Computing, Graduate School of Information Science and Technology, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8654, Japan.

Publication information

Sci Rep. 2020 Jun 22;10(1):10062. doi: 10.1038/s41598-020-66441-8.

DOI:10.1038/s41598-020-66441-8

PMID:32572093

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7308406/

Abstract

Photonic reservoir computing is an emerging technology for beyond-von Neumann computing. Although photonic reservoir computing performs well in environments whose characteristics match the training datasets for the reservoir, its performance degrades significantly when these characteristics deviate from the knowledge acquired in the training phase. Here, we propose a scheme for adaptive model selection in photonic reservoir computing using reinforcement learning. In this scheme, a temporal waveform is generated by different dynamic source models that change over time. The system autonomously identifies the best source model for the task of time series prediction using photonic reservoir computing and reinforcement learning. We prepare two types of output weights for the source models, and the system adaptively selects the correct model using reinforcement learning, where the prediction errors are associated with rewards. We succeed in adaptive model selection both when the source signal is a temporal mixture of signals generated by two different dynamic system models and when it is a mixture of signals from the same model with different parameter values. This study paves the way for autonomous behavior in photonic artificial intelligence and could lead to new applications in load forecasting and multi-objective control, where frequent environment changes are expected.
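
The selection loop described in the abstract can be illustrated with a minimal software sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two source models are logistic maps with hypothetical parameters 3.7 and 3.9, each "reservoir with trained output weights" is replaced by an exact one-step predictor for one of the maps, the reinforcement learning rule is a simple epsilon-greedy bandit (the abstract does not say which algorithm is used), and the reward is +1 when the prediction error falls below a hypothetical tolerance `tol` and -1 otherwise.

```python
import random


def logistic_step(x, r):
    """Advance the logistic map one step (toy stand-in for a dynamic source model)."""
    return r * x * (1.0 - x)


def run_selection(steps=800, switch_every=200, epsilon=0.1, alpha=0.2,
                  tol=1e-3, seed=0):
    """Epsilon-greedy choice between two fixed predictors on a switching source.

    Returns (choices, truth): the predictor picked and the source model that
    was actually active at every time step.
    """
    rng = random.Random(seed)
    params = [3.7, 3.9]  # hypothetical parameters of the two source models
    # Stand-ins for "reservoir + output weights trained on model k".
    predictors = [lambda x, r=r: logistic_step(x, r) for r in params]
    values = [0.0, 0.0]  # running reward estimate for each predictor
    x = 0.4
    choices, truth = [], []
    for t in range(steps):
        active = (t // switch_every) % 2  # the source model changes over time
        if rng.random() < epsilon:
            k = rng.randrange(2)  # explore: try a random predictor
        else:
            k = max(range(2), key=values.__getitem__)  # exploit the best estimate
        x_next = logistic_step(x, params[active])
        error = abs(predictors[k](x) - x_next)  # one-step prediction error
        reward = 1.0 if error < tol else -1.0  # small error gives a positive reward
        values[k] += alpha * (reward - values[k])  # constant step size forgets old rewards
        choices.append(k)
        truth.append(active)
        x = x_next
    return choices, truth


if __name__ == "__main__":
    choices, truth = run_selection()
    hits = sum(c == a for c, a in zip(choices, truth))
    print(f"selected the active model on {hits}/{len(truth)} steps")
```

The constant step size `alpha` is a deliberate choice in this sketch: it weights recent rewards more heavily, so the value estimate of a predictor that was correct before a switch decays quickly once its predictions start to fail, and the other predictor takes over.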
