Institut für Theoretische Physik, Technische Universität Berlin, Hardenbergstr. 36, 10623, Berlin, Germany.
Eur Phys J E Soft Matter. 2023 Jun 19;46(6):48. doi: 10.1140/epje/s10189-023-00309-3.
We employ Q learning, a variant of reinforcement learning, so that an active particle learns by itself to navigate along the fastest path toward a target while subject to external forces and flow fields. The state variables are the distance and direction to the target; as its action, the active particle chooses a new orientation along which it moves at constant speed. We explicitly investigate optimal navigation across a potential barrier or well and in uniform, Poiseuille, and swirling flow fields. We show that Q learning is able to identify the fastest path and discuss the results. We also demonstrate that Q learning, and applying the learned policy, still works when the particle orientation experiences thermal noise. However, success depends strongly on the specific problem and on the noise strength.
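The setup described in the abstract (state given by distance and direction to the target, action given by a new orientation, motion at constant speed in a flow) can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the bin counts, the reward of minus one time step per move, the uniform flow of strength 0.3, the learning parameters, and all function names (`flow`, `state_of`, `step`, `q_update`, `train`) are assumptions chosen only for illustration.

```python
import math
import random

# Hypothetical discretization and parameters (not taken from the paper).
N_DIST, N_ANG, N_ACT = 8, 8, 8   # distance bins, direction bins, orientations
R_MAX, R_TARGET = 4.0, 0.25      # largest binned distance, target radius
V0, DT = 1.0, 0.1                # constant swimming speed, time step


def flow(x, y):
    """Illustrative uniform flow along +x; any other field could be used."""
    return 0.3, 0.0


def state_of(x, y):
    """Map a position to (distance bin, direction bin); target at the origin."""
    dist = math.hypot(x, y)
    ang = math.atan2(-y, -x) % (2 * math.pi)   # direction toward the target
    i = min(int(dist / R_MAX * N_DIST), N_DIST - 1)
    j = int(ang / (2 * math.pi) * N_ANG) % N_ANG
    return i, j


def step(x, y, action, rng, noise=0.0):
    """Advance one time step along the chosen orientation plus the flow."""
    theta = 2 * math.pi * action / N_ACT + noise * rng.gauss(0.0, 1.0)
    ux, uy = flow(x, y)
    return (x + DT * (V0 * math.cos(theta) + ux),
            y + DT * (V0 * math.sin(theta) + uy))


def q_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """Standard Q-learning update; no bootstrap term on terminal steps."""
    best_next = 0.0 if done else max(Q[s_next[0]][s_next[1]])
    i, j = s
    Q[i][j][a] += alpha * (r + gamma * best_next - Q[i][j][a])


def train(episodes=200, max_steps=200, eps=0.1, noise=0.0, seed=0):
    """Epsilon-greedy training; each step costs DT, so fast paths score best."""
    rng = random.Random(seed)
    Q = [[[0.0] * N_ACT for _ in range(N_ANG)] for _ in range(N_DIST)]
    for _ in range(episodes):
        x, y = rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)
        for _ in range(max_steps):
            s = state_of(x, y)
            if rng.random() < eps:
                a = rng.randrange(N_ACT)
            else:
                a = max(range(N_ACT), key=lambda k: Q[s[0]][s[1]][k])
            x, y = step(x, y, a, rng, noise)
            done = math.hypot(x, y) < R_TARGET
            q_update(Q, s, a, -DT, state_of(x, y), done)
            if done:
                break
    return Q
```

Calling `train()` returns the Q table, from which a greedy policy follows by taking the highest-valued orientation in each state. Passing a nonzero `noise` adds Gaussian perturbations to the orientation, a crude stand-in for the thermal noise mentioned above.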