Sanders Julia, Muratore-Ginanneschi Paolo
Department of Mathematics and Statistics, University of Helsinki, 00014 Helsinki, Finland.
Entropy (Basel). 2025 Feb 20;27(3):218. doi: 10.3390/e27030218.
Optimal control theory aims to find an optimal protocol that steers a system between assigned boundary conditions in finite time while minimizing a given cost functional. The equations arising from such problems are often non-linear and difficult to solve numerically. In this article, we describe numerical integration methods for two partial differential equations that commonly arise in optimal control theory: the Fokker-Planck equation driven by a mechanical potential, for which we use the Girsanov theorem; and the Hamilton-Jacobi-Bellman (dynamic programming) equation, for which we compute the gradient of the solution using the Bismut-Elworthy-Li formula. Computing this gradient is necessary to specify the optimal protocol. Finally, we give an example application of these numerical techniques: solving an optimal control problem without spatial discretization using machine learning.
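As a rough illustration of the Bismut-Elworthy-Li idea mentioned in the abstract, the sketch below estimates the gradient of u(x) = E[f(X_T) | X_0 = x] by Monte Carlo without differentiating f. It is not the authors' implementation. The test problem is an assumption chosen because it has a closed-form answer: an Ornstein-Uhlenbeck process dX = -X dt + sigma dW with f(x) = x^2, for which the exact gradient is 2x exp(-2T). The function name `bel_gradient_ou`, its parameters, and the Euler-Maruyama discretization are likewise illustrative choices.

```python
import math
import random


def bel_gradient_ou(x0, T=1.0, sigma=1.0, n_steps=50, n_paths=20000, seed=0):
    """Monte Carlo estimate of d/dx E[X_T**2 | X_0 = x] at x = x0.

    Assumed model (illustrative only): dX = -X dt + sigma dW.
    Bismut-Elworthy-Li weight: (1 / T) * integral_0^T (J_s / sigma) dW_s,
    where J_s = dX_s / dx solves dJ = -J dt with J_0 = 1.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _path in range(n_paths):
        x = x0        # state X_t
        jac = 1.0     # first variation J_t = dX_t / dx
        weight = 0.0  # running Ito integral of J_s / sigma against dW_s
        for _step in range(n_steps):
            dw = sqrt_dt * rng.gauss(0.0, 1.0)
            # Ito convention: use J at the start of the step.
            weight += jac * dw / sigma
            # Euler-Maruyama updates for the state and its first variation.
            x += -x * dt + sigma * dw
            jac += -jac * dt
        # f(X_T) times the weight; no derivative of f is needed.
        total += x * x * weight / T
    return total / n_paths


if __name__ == "__main__":
    estimate = bel_gradient_ou(1.0)
    exact = 2.0 * math.exp(-2.0)
    print(f"estimate = {estimate:.4f}, exact = {exact:.4f}")
```

The estimate carries both statistical error (it shrinks like one over the square root of `n_paths`) and a small time-discretization bias, so it should be compared with the closed-form value only within a tolerance.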