Opt Express. 2022 Oct 24;30(22):39582-39596. doi: 10.1364/OE.471629.
Recently, deep reinforcement learning (DRL) for metasurface design has received increased attention for its excellent decision-making ability in complex problems. However, time-consuming numerical simulation has hindered the adoption of DRL-based design methods. Here we apply the deep-learning-based virtual environment proximal policy optimization (DE-PPO) method to design 3D chiral plasmonic metasurfaces for flexible targets, modeling the metasurface design process as a Markov decision process to aid training. A well-trained DRL agent designs chiral metasurfaces that exhibit optimal absolute circular dichroism values (typically ∼0.4) at various target wavelengths, such as 930 nm, 1000 nm, 1035 nm, and 1100 nm, with high time efficiency. Moreover, the training process of the PPO agent is exceptionally fast with the help of the deep neural network (DNN) auxiliary virtual environment. In addition, the method changes all variable parameters of the nanostructure simultaneously, reducing the size of the action vector and thus the output size of the DNN. The proposed approach could find applications in the efficient and intelligent design of nanophotonic devices.
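As a rough illustration of the workflow the abstract describes (not the authors' code), the sketch below models the design task as a Markov decision process in which the state is the current geometry, a single action vector increments all geometric parameters at once, and the reward is the absolute circular dichroism at the target wavelength as predicted by a DNN surrogate standing in for the numerical simulator. Every name, parameter range, and the dummy surrogate itself are hypothetical assumptions for illustration only.

```python
import numpy as np

WAVELENGTHS = np.linspace(900, 1150, 64)  # nm, hypothetical spectral grid

class DummySurrogate:
    """Stand-in for the pretrained DNN virtual environment that maps
    geometry parameters to a circular-dichroism (CD) spectrum.
    A real implementation would be a trained neural network."""
    def __init__(self, n_params=4, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(n_params, WAVELENGTHS.size))

    def predict(self, params):
        # Smooth, bounded fake spectrum in [-0.5, 0.5]
        return 0.5 * np.tanh(params @ self.w)

class ChiralMetasurfaceEnv:
    """Markov decision process for metasurface design, following the
    abstract: state = current geometry parameters, action = simultaneous
    increment of *all* parameters, reward = |CD| at the target wavelength."""
    def __init__(self, surrogate, target_nm=1000.0, n_params=4, max_steps=20):
        self.surrogate = surrogate
        self.idx = int(np.abs(WAVELENGTHS - target_nm).argmin())
        self.n_params = n_params
        self.max_steps = max_steps

    def reset(self):
        self.params = np.full(self.n_params, 0.5)  # normalized geometry
        self.t = 0
        return self.params.copy()

    def step(self, action):
        # One action vector updates every geometric parameter at once,
        # which keeps the policy network's output small.
        self.params = np.clip(self.params + 0.05 * np.asarray(action), 0.0, 1.0)
        self.t += 1
        cd = self.surrogate.predict(self.params)[self.idx]
        reward = abs(cd)                  # maximize |CD| at the target
        done = self.t >= self.max_steps
        return self.params.copy(), reward, done, {"cd": cd}

# Minimal usage: a random policy standing in for the trained PPO agent.
env = ChiralMetasurfaceEnv(DummySurrogate(), target_nm=1035.0)
state, rng = env.reset(), np.random.default_rng(1)
while True:
    state, reward, done, info = env.step(rng.uniform(-1, 1, size=4))
    if done:
        print(f"final |CD| ≈ {reward:.3f}")
        break
```

Because the environment exposes a standard reset/step interface, any off-the-shelf PPO implementation could be trained against it; the key efficiency gain claimed in the abstract comes from replacing the slow numerical simulation with the fast DNN surrogate inside the loop.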