Biroli Giulio, Bonnaire Tony, de Bortoli Valentin, Mézard Marc
Laboratoire de Physique de l'Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris Cité, Paris, France.
Computer Science Department, ENS, CNRS, PSL University, Paris, France.
Nat Commun. 2024 Nov 17;15(1):9957. doi: 10.1038/s41467-024-54281-3.
We study generative diffusion models in the regime where both the data dimension and the sample size are large, and the score function is trained optimally. Using statistical physics methods, we identify three distinct dynamical regimes during the generative diffusion process, separated by two transitions. The generative dynamics, starting from pure noise, first encounters a speciation transition, where the broad structure of the data emerges, akin to symmetry breaking in phase transitions. This is followed at a later time by a collapse transition, where the dynamics is attracted to a specific training point through a mechanism similar to condensation in a glass phase. The speciation time can be obtained from a spectral analysis of the data's correlation matrix, while the collapse time relates to an excess entropy measure and reveals the existence of a curse of dimensionality for diffusion models. These theoretical findings are supported by analytical solutions for Gaussian mixtures and confirmed by numerical experiments on real datasets.
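The statement that the speciation time follows from a spectral analysis of the data's correlation matrix can be illustrated with a short numerical sketch. The abstract does not give the formula, so the code below rests on stated assumptions: the forward process is taken to be an Ornstein-Uhlenbeck process, and the speciation time is taken as t_S = (1/2) log Λ, where Λ is the largest eigenvalue of the empirical covariance matrix. The function names `speciation_time` and `two_cluster_mixture`, and the toy two-cluster Gaussian mixture itself, are hypothetical and introduced only for illustration.

```python
import numpy as np


def speciation_time(data: np.ndarray) -> float:
    """Estimate a speciation time from the top eigenvalue of the data covariance.

    Assumption (not stated in the abstract): t_S = 0.5 * log(Lambda), with
    Lambda the largest eigenvalue of the empirical covariance matrix.
    """
    centered = data - data.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / data.shape[0]
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    top_eigenvalue = np.linalg.eigvalsh(cov)[-1]
    return 0.5 * float(np.log(top_eigenvalue))


def two_cluster_mixture(n: int, d: int, mu: float = 1.0,
                        sigma: float = 1.0, seed: int = 0) -> np.ndarray:
    """Draw n points in dimension d from a balanced two-cluster Gaussian mixture.

    Illustrative toy data: cluster means sit at +/- mu * (1, ..., 1) and each
    cluster has isotropic noise of standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n, 1))
    centers = signs * mu * np.ones(d)
    return centers + sigma * rng.standard_normal((n, d))


# Usage: the estimate grows with the dimension, since the top eigenvalue
# of this toy mixture is close to sigma**2 + d * mu**2.
for dim in (16, 64, 256):
    samples = two_cluster_mixture(n=5000, d=dim)
    print(dim, speciation_time(samples))
```

For this toy mixture the population covariance is sigma² I + m mᵀ with m = mu (1, ..., 1), so the top eigenvalue is sigma² + d mu² and, under the assumed formula, the estimated speciation time grows like (1/2) log d.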