Wauthier Samuel T, De Boom Cedric, Çatal Ozan, Verbelen Tim, Dhoedt Bart
IDLab, Department of Information Technology, Ghent University-imec, Ghent, Belgium.
Front Neurorobot. 2022 Mar 11;16:795846. doi: 10.3389/fnbot.2022.795846. eCollection 2022.
Although still not fully understood, sleep is known to play an important role in learning and in pruning synaptic connections. From the active inference perspective, these processes can be cast as learning the parameters of a generative model and as Bayesian model reduction, respectively. In this article, we show how to reduce the dimensionality of the latent space of such a generative model, and hence model complexity, during training in deep active inference through a similar process. While deep active inference uses deep neural networks for state space construction, an issue remains in that the dimensionality of the latent space must be specified beforehand. We investigate two methods that are able to prune the latent space of deep active inference models. The first approach functions similarly to sleep and performs model reduction. The second approach is a novel method which is more similar to reflection, operates during training, and displays "aha" moments when the model is able to reduce latent space dimensionality. We show for two well-known simulated environments that model performance is retained in the first approach and diminishes only slightly in the second approach. We also show that reconstructions from a real-world example are indistinguishable before and after reduction. We conclude that the most important difference constitutes a trade-off between training time and model performance in terms of accuracy, the ability to generalize, and minimization of model complexity.
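As a rough illustration of the kind of latent-space pruning the abstract describes, the sketch below drops latent dimensions of a VAE-style generative model whose approximate posterior stays close to the prior (near-zero KL divergence), a common heuristic for identifying uninformative dimensions. This is a minimal sketch under that assumption, not the authors' exact procedure; the function names and the fixed threshold are illustrative choices.

```python
import torch

def kl_per_dimension(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Per-dimension KL(q || p) for a diagonal Gaussian posterior
    q = N(mu, exp(logvar)) against a standard normal prior,
    averaged over the batch axis."""
    return 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).mean(dim=0)

def informative_dimensions(mu: torch.Tensor, logvar: torch.Tensor,
                           threshold: float = 1e-2) -> torch.Tensor:
    """Boolean mask of latent dimensions worth keeping: dimensions whose
    posterior barely departs from the prior (KL below the threshold)
    carry almost no information about the input and are candidates
    for pruning."""
    return kl_per_dimension(mu, logvar) > threshold

# Usage: mu and logvar would be encoder outputs on a held-out batch,
# both of shape (batch_size, latent_dim); random values here for demo.
mu, logvar = torch.randn(128, 16), torch.randn(128, 16)
keep = informative_dimensions(mu, logvar)
print(f"retaining {int(keep.sum())} of {keep.numel()} latent dimensions")
```

In a sleep-like, post-hoc variant one would apply such a mask after training; a reflection-like variant could evaluate it periodically during training and shrink the model whenever dimensions become prunable.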