Ghio Davide, Dandi Yatin, Krzakala Florent, Zdeborová Lenka
Information, Learning and Physics Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland.
Statistical Physics of Computation Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne CH-1015, Switzerland.
Proc Natl Acad Sci U S A. 2024 Jul 2;121(27):e2311810121. doi: 10.1073/pnas.2311810121. Epub 2024 Jun 24.
Recent years have witnessed the development of powerful generative models based on flows, diffusion, or autoregressive neural networks, achieving remarkable success in generating data from examples, with applications in a broad range of areas. A theoretical analysis of the performance and understanding of the limitations of these methods remain, however, challenging. In this paper, we undertake a step in this direction by analyzing the efficiency of sampling by these methods on a class of problems with a known probability distribution and comparing it with the sampling performance of more traditional methods such as Markov chain Monte Carlo and Langevin dynamics. We focus on a class of probability distributions widely studied in the statistical physics of disordered systems that relate to spin glasses, statistical inference, and constraint satisfaction problems. We leverage the fact that sampling via flow-based, diffusion-based, or autoregressive network methods can be equivalently mapped to the analysis of a Bayes optimal denoising of a modified probability measure. Our findings demonstrate that these methods encounter difficulties in sampling stemming from the presence of a first-order phase transition along the algorithm's denoising path. Our conclusions go both ways: We identify regions of parameters where these methods are unable to sample efficiently, while this is possible using standard Monte Carlo or Langevin approaches. We also identify regions where the opposite happens: standard approaches are inefficient while the discussed generative methods work well.