Tausani Lorenzo, Testolin Alberto, Zorzi Marco
Department of General Psychology and Padova Neuroscience Center, University of Padova, Padova, Italy.
Department of Mathematics, University of Padova, Padova, Italy.
Sci Rep. 2025 Jan 22;15(1):2875. doi: 10.1038/s41598-024-85055-y.
Hierarchical generative models can produce data samples based on the statistical structure of their training distribution. This capability can be linked to current theories in computational neuroscience, which propose that spontaneous brain activity at rest is the manifestation of top-down dynamics of generative models detached from action-perception cycles. A popular class of hierarchical generative models is that of Deep Belief Networks (DBNs), which are energy-based deep learning architectures that can learn multiple levels of representations in a completely unsupervised way by exploiting Hebbian-like learning mechanisms. In this work, we study the generative dynamics of a recent extension of the DBN, the iterative DBN (iDBN), which more faithfully simulates neurocognitive development by jointly tuning the connection weights across all layers of the hierarchy. We characterize the number of states visited during top-down sampling and investigate whether the heterogeneity of visited attractors can be increased by initiating the generation process from biased hidden states. To this end, we train iDBN models on well-known datasets containing handwritten digits and pictures of human faces, and show that the ability to generate diverse data prototypes can be enhanced by initializing top-down sampling from "chimera states", which represent high-level features combining multiple abstract representations of the sensory data. Although the models are not always able to transition between all potential target states within a single generation trajectory, the iDBN shows richer top-down dynamics in comparison to a shallow generative model (a single-layer Restricted Boltzmann Machine). We further show that the generated samples can be used to support continual learning through generative replay mechanisms.
Our findings suggest that the top-down dynamics of hierarchical generative models are significantly influenced by the shape of the energy function, which depends both on the depth of the processing architecture and on the statistical structure of the sensory data.
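The top-down generation described in the abstract relies on alternating (block) Gibbs sampling between layers of binary stochastic units. A minimal sketch for a single binary RBM layer, using toy dimensions and random untrained weights, is given below; all names and sizes are illustrative assumptions, not taken from the paper's models:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical toy dimensions: 16 visible and 8 hidden binary units.
n_visible, n_hidden = 16, 8
W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))  # connection weights
b_v = np.zeros(n_visible)  # visible biases
b_h = np.zeros(n_hidden)   # hidden biases

def sample_hidden(v):
    """Bottom-up step: sample binary hidden units given a visible state."""
    p = sigmoid(v @ W + b_h)
    return (rng.random(p.shape) < p).astype(float)

def sample_visible(h):
    """Top-down step: sample binary visible units given a hidden state."""
    p = sigmoid(h @ W.T + b_v)
    return (rng.random(p.shape) < p).astype(float)

def gibbs_chain(h0, steps=50):
    """Run an alternating Gibbs chain started from a hidden state h0,
    returning the final visible and hidden configurations."""
    h = h0
    for _ in range(steps):
        v = sample_visible(h)
        h = sample_hidden(v)
    return v, h

# Initiate generation from a biased hidden state (here: random bits),
# loosely analogous to seeding top-down sampling from a chosen state.
h0 = (rng.random(n_hidden) < 0.5).astype(float)
v, h = gibbs_chain(h0)
```

In a trained model the chain would wander between the attractors shaped by learning; with random weights, as here, it merely illustrates the sampling mechanics.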