van de Laar Thijs W, de Vries Bert
Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, Netherlands.
GN Hearing Benelux BV, Eindhoven, Netherlands.
Front Robot AI. 2019 Mar 28;6:20. doi: 10.3389/frobt.2019.00020. eCollection 2019.
The free energy principle (FEP) offers a variational calculus-based description of how biological agents persevere through interactions with their environment. Active inference (AI) is a corollary of the FEP, which states that biological agents act to fulfill prior beliefs about preferred future observations (target priors). Purposeful behavior then results from minimizing variational free energy with respect to a generative model of the environment that includes target priors. However, manual derivation of free-energy-minimizing algorithms for custom dynamic models can become tedious and error-prone. While probabilistic programming (PP) techniques enable automatic derivation of inference algorithms on free-form models, full automation of AI requires specialized tools for inference on dynamic models, together with the description of an experimental protocol that governs the interaction between the agent and its simulated environment. The contributions of the present paper are twofold. First, we illustrate how AI can be automated with ForneyLab, a recent PP toolbox that specializes in variational inference on flexibly definable dynamic models. More specifically, we describe AI agents in a dynamic environment as probabilistic state space models (SSMs) and perform inference for perception and control in these agents by message passing on a factor graph representation of the SSM. Second, we propose a formal experimental protocol for simulated AI. We exemplify how this protocol leads to goal-directed behavior for flexibly definable AI agents in two classical reinforcement learning (RL) examples, namely the Bayesian thermostat and the mountain car parking problems.
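To make the idea of an agent acting to fulfill a target prior more concrete, the following Python sketch simulates a thermostat-like agent in a simulated environment. It is not the paper's ForneyLab implementation and does not use message passing on a factor graph. It assumes a hypothetical linear-Gaussian model (temperature changes by the action plus noise, and is observed with noise), a one-step lookahead, and closed-form Gaussian updates. All function names, parameter values, and the order of the steps in the loop are illustrative assumptions.

```python
import math
import random


def infer_action(mean, var, target, target_var, control_var, process_var, obs_var):
    """Posterior mean of the next action under a Gaussian target prior.

    Assumed model (hypothetical, one-step lookahead):
      action a      ~ N(0, control_var)          (control prior)
      next obs  y   = x + a + noise              (x ~ N(mean, var))
      target prior  : y ~ N(target, target_var)
    """
    # Total uncertainty between the action and the desired observation.
    total_var = var + process_var + obs_var + target_var
    gain = control_var / (control_var + total_var)
    return gain * (target - mean)


def update_belief(mean, var, action, obs, process_var, obs_var):
    """Kalman-style perception step: predict with the action, correct with obs."""
    pred_mean = mean + action
    pred_var = var + process_var
    k = pred_var / (pred_var + obs_var)
    return pred_mean + k * (obs - pred_mean), (1.0 - k) * pred_var


def run(steps=50, seed=0, target=20.0):
    """Simulate the agent-environment loop and return (true_temp, belief_mean)."""
    rng = random.Random(seed)
    process_var, obs_var = 0.01, 0.01      # assumed noise levels
    target_var, control_var = 0.01, 100.0  # tight goal, weak control prior
    true_temp = 10.0                       # hidden environment state
    mean, var = 0.0, 100.0                 # agent's initial (vague) belief

    for _ in range(steps):
        # 1. Agent infers an action from its belief and the target prior.
        action = infer_action(mean, var, target, target_var,
                              control_var, process_var, obs_var)
        # 2. Environment executes the action and emits a noisy observation.
        true_temp += action + rng.gauss(0.0, math.sqrt(process_var))
        obs = true_temp + rng.gauss(0.0, math.sqrt(obs_var))
        # 3. Agent updates its belief about the temperature.
        mean, var = update_belief(mean, var, action, obs, process_var, obs_var)
    return true_temp, mean


if __name__ == "__main__":
    final_temp, belief = run()
    print(f"final temperature: {final_temp:.2f}, agent belief: {belief:.2f}")
```

In this sketch the target prior plays the role of a goal: the action is obtained by inference rather than by maximizing an explicit reward, and a broader control prior (larger `control_var`) lets the agent move more aggressively toward the target.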