Path sampling of recurrent neural networks by incorporating known physics.
Author information
Department of Physics, University of Maryland, College Park, MD, 20742, USA.
Institute for Physical Science and Technology, University of Maryland, College Park, MD, 20742, USA.
Publication information
Nat Commun. 2022 Nov 24;13(1):7231. doi: 10.1038/s41467-022-34780-x.
Recurrent neural networks have seen widespread use in modeling dynamical systems in varied domains such as weather prediction, text prediction, and several others. Often one wishes to supplement the experimentally observed dynamics with prior knowledge or intuition about the system. While the recurrent nature of these networks allows them to model arbitrarily long memories in the time series used in training, it also makes it harder to impose prior knowledge or intuition through generic constraints. In this work, we present a path sampling approach based on the principle of Maximum Caliber that allows us to include generic thermodynamic or kinetic constraints in recurrent neural networks. We demonstrate the method for a widely used type of recurrent neural network, the long short-term memory (LSTM) network, in the context of supplementing time series collected from different application domains. These include classical molecular dynamics of a protein and Monte Carlo simulations of an open quantum system that continuously loses photons to the environment and displays Rabi oscillations. Our method can be easily generalized to other generative artificial intelligence models and to generic time series in different areas of the physical and social sciences, where one wishes to supplement limited data with intuition- or theory-based corrections.
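To make the idea of constraining sampled paths concrete, here is a minimal sketch and not the authors' implementation. It assumes the standard maximum-entropy result for a single average constraint: paths are reweighted by exp(-lambda * s), where s is a path observable and lambda is chosen so the reweighted mean of s matches a target. The function name `caliber_weights`, the toy two-state paths, the transition-count observable, and the target value are all hypothetical choices made for illustration.

```python
import numpy as np

def caliber_weights(observables, target_mean, lo=-50.0, hi=50.0, tol=1e-10):
    """Find lam and normalized weights w_i ∝ exp(-lam * s_i) such that
    sum_i w_i * s_i is approximately target_mean (hypothetical helper)."""
    s = np.asarray(observables, dtype=float)
    if not (s.min() < target_mean < s.max()):
        raise ValueError("target_mean must lie strictly inside the observed range")

    def reweight(lam):
        logw = -lam * s
        logw -= logw.max()          # subtract the max for numerical stability
        w = np.exp(logw)
        w /= w.sum()
        return w, float(w @ s)

    # The reweighted mean decreases monotonically with lam
    # (its derivative is minus the reweighted variance), so bisection works
    # as long as the solution lies inside the assumed bracket [lo, hi].
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        _, mean = reweight(mid)
        if mean > target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    lam = 0.5 * (lo + hi)
    w, _ = reweight(lam)
    return lam, w

# Toy stand-in for trajectories generated by a trained model (assumption).
rng = np.random.default_rng(0)
paths = rng.integers(0, 2, size=(500, 20))

# Hypothetical kinetic observable: number of state transitions per path.
s = np.abs(np.diff(paths, axis=1)).sum(axis=1)

# Hypothetical constraint: slow the dynamics to 70% of the sampled transition count.
target = 0.7 * s.mean()
lam, w = caliber_weights(s, target)

# Draw a constrained subset of paths according to the weights.
subset = paths[rng.choice(len(paths), size=200, p=w)]
```

In a workflow of the kind the abstract describes, a reweighted subset like `subset` could serve as additional training data for the network so that it reflects the imposed constraint; that retraining step is not shown here.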