Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Hvidovre, Kettegård Allé 30, 2650 Hvidovre, Denmark.
Group for Neural Theory, LNC INSERM U960, DEC, École Normale Supérieure, PSL University, Paris, France; Center for Cognition and Decision Making, Institute for Cognitive Neuroscience, NRU Higher School of Economics, Moscow, Russia.
Phys Life Rev. 2019 Dec;31:214-232. doi: 10.1016/j.plrev.2019.07.005. Epub 2019 Jul 19.
Homeostasis is a problem for all living agents. It entails predictively regulating internal states within the bounds compatible with survival in order to maximise fitness. This can be achieved physiologically, through complex hierarchies of autonomic regulation, but it must also be achieved via behavioural control, both reactive and proactive. Here we briefly review some of the major theories of homeostatic control and their historical cognates, addressing how they tackle the optimisation of both physiological and behavioural homeostasis. We start with optimal control approaches, setting up key concepts and exploring their strengths and limitations. We then concentrate on contemporary neurocomputational approaches to homeostatic control, focusing primarily on a branch of reinforcement learning known as homeostatic reinforcement learning (HRL). A central premise of HRL is that reward optimisation is directly coupled to homeostatic control. A central construct in this framework is the drive function, which maps homeostatic state to motivational drive; reductions in drive are operationally defined as reward values. We explain HRL's main advantages, empirical applications, and conceptual insights. Notably, we show how simple constraints on the drive function can yield a normative account of predictive control, as well as account for phenomena such as satiety, risk aversion, and interactions between competing homeostatic needs. We illustrate how HRL agents can learn to avoid hazardous states without any need to experience them, and how HRL can be applied in clinical domains. Finally, we outline several challenges to HRL, and how survival constraints and active inference models could circumvent these problems.
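The following is a minimal sketch of the reward construction described above: a drive function that maps the internal (homeostatic) state onto a scalar motivational drive, with reward operationally defined as the reduction in drive produced by a state transition. The convex power-law form of the drive function, the exponents m and n, the setpoints, and the example states are illustrative assumptions (in the spirit of the HRL literature, e.g. Keramati & Gutkin), not parameters taken from this paper.

```python
import numpy as np

def drive(h, setpoint, m=3.0, n=4.0):
    """Drive function: maps an internal-state vector `h` onto a scalar
    motivational drive, a convex distance from the homeostatic setpoint.
    Larger deviations from the setpoint produce larger drive.
    The exponents m and n are illustrative assumptions."""
    return np.sum(np.abs(setpoint - np.asarray(h)) ** n) ** (1.0 / m)

def reward(h_t, h_next, setpoint):
    """Reward is operationally defined as the reduction in drive caused
    by a transition of the internal state (e.g. after eating or drinking)."""
    return drive(h_t, setpoint) - drive(h_next, setpoint)

# Illustrative example: two regulated variables (say, energy and hydration),
# expressed as deviations from their setpoints (so the setpoint is zero).
setpoint = np.zeros(2)
h_t = np.array([-3.0, 1.0])      # large energy deficit, small hydration surplus
h_next = np.array([-1.0, 1.0])   # an action (eating) reduced the energy deficit

r = reward(h_t, h_next, setpoint)
print(f"drive before: {drive(h_t, setpoint):.3f}, "
      f"drive after: {drive(h_next, setpoint):.3f}, reward: {r:.3f}")
```

Under this kind of convex drive function, actions that move the internal state toward the setpoint are rewarding, repeated corrections of the same deficit yield diminishing rewards (a satiety-like effect), and variance in outcomes is penalised (risk aversion); multiple regulated variables interact through the single scalar drive, which is how competing homeostatic needs can be traded off in this sketch.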