Decision making for large-scale multi-armed bandit problems using bias control of chaotic temporal waveforms in semiconductor lasers.

Authors

Morijiri Kensei, Mihana Takatomo, Kanno Kazutaka, Naruse Makoto, Uchida Atsushi

Affiliations

Department of Information and Computer Sciences, Saitama University, 255 Shimo-okubo, Sakura-ku, Saitama City, Saitama, 338-8570, Japan.

Department of Information Physics and Computing, Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan.

Publication information

Sci Rep. 2022 May 16;12(1):8073. doi: 10.1038/s41598-022-12155-y.

Abstract

Decision making using photonic technologies has been intensively researched for solving the multi-armed bandit problem, which is fundamental to reinforcement learning. However, these technologies are yet to be extended to large-scale multi-armed bandit problems. In this study, we conduct a numerical investigation of decision making to solve large-scale multi-armed bandit problems by controlling the biases of chaotic temporal waveforms generated in semiconductor lasers with optical feedback. We generate chaotic temporal waveforms using the semiconductor lasers, and each waveform is assigned to a slot machine (or choice) in the multi-armed bandit problem. The biases in the amplitudes of the chaotic waveforms are adjusted based on rewards using the tug-of-war method. Subsequently, the slot machine that yields the maximum-amplitude chaotic temporal waveform with bias is selected. The scaling properties of the correct decision-making process are examined by increasing the number of slot machines to 1024, and the scaling exponent of the power-law distribution is 0.97. We demonstrate that the proposed method outperforms existing software algorithms in terms of the scaling exponent. This result paves the way for photonic decision making in large-scale multi-armed bandit problems using photonic accelerators.
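The selection loop described in the abstract (one waveform per slot machine, a reward-driven bias on each waveform, and selection of the machine whose biased amplitude is largest) can be illustrated with a minimal software sketch. This is not the authors' implementation: Gaussian pseudo-random samples stand in for the chaotic laser waveforms, and the bias update is a simplified fixed-step rule (add `delta` on a reward, subtract `omega` otherwise) rather than the tug-of-war update used in the paper. The names `simulate`, `hit_probs`, `delta`, and `omega` are hypothetical, chosen only for this illustration.

```python
import random


def simulate(hit_probs, steps=2000, seed=0, delta=1.0, omega=1.0):
    """Toy bias-controlled arm selection (illustrative assumptions only).

    hit_probs: reward probability of each slot machine.
    delta / omega: assumed fixed bias increments for reward / no reward.
    Returns (selection counts per machine, final biases).
    """
    rng = random.Random(seed)
    n = len(hit_probs)
    bias = [0.0] * n      # one bias per waveform / slot machine
    counts = [0] * n      # how often each machine was selected

    for _ in range(steps):
        # Assumption: Gaussian noise replaces the chaotic waveform samples.
        signal = [rng.gauss(0.0, 1.0) for _ in range(n)]

        # Select the machine with the maximum biased amplitude.
        arm = max(range(n), key=lambda i: signal[i] + bias[i])
        counts[arm] += 1

        # Play the selected machine and adjust only its bias.
        rewarded = rng.random() < hit_probs[arm]
        bias[arm] += delta if rewarded else -omega

    return counts, bias


if __name__ == "__main__":
    counts, bias = simulate([0.2, 0.3, 0.9, 0.4])
    print("selections per machine:", counts)
    print("final biases:", bias)
```

With these fixed steps the rule only separates the machines when the best hit probability is above 0.5 and the others are below it, so the example probabilities are chosen accordingly. Handling arbitrary reward probabilities is what the reward-dependent adjustment in the tug-of-war method is for.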

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2227/9110346/cbf3c35396b2/41598_2022_12155_Fig1_HTML.jpg
