Javanmard Adel, Montanari Andrea, Ricci-Tersenghi Federico
Marshall School of Business, University of Southern California, Los Angeles, CA 90089;
Department of Statistics, Stanford University, Stanford, CA 94305; Department of Electrical Engineering, Stanford University, Stanford, CA 94305;
Proc Natl Acad Sci U S A. 2016 Apr 19;113(16):E2218-23. doi: 10.1073/pnas.1523097113. Epub 2016 Mar 21.
Statistical inference problems arising within signal processing, data mining, and machine learning naturally give rise to hard combinatorial optimization problems. These problems become intractable when the dimensionality of the data is large, as is often the case for modern datasets. A popular idea is to construct convex relaxations of these combinatorial problems, which can be solved efficiently for large-scale datasets. Semidefinite programming (SDP) relaxations are among the most powerful methods in this family and are surprisingly well suited for a broad range of problems where data take the form of matrices or graphs. It has been observed several times that when the statistical noise is small enough, SDP relaxations correctly detect the underlying combinatorial structures. In this paper we develop asymptotic predictions for several detection thresholds, as well as for the estimation error above these thresholds. We study some classical SDP relaxations for statistical problems motivated by graph synchronization and community detection in networks. We map these optimization problems to statistical mechanics models with vector spins and use nonrigorous techniques from statistical mechanics to characterize the corresponding phase transitions. Our results clarify the effectiveness of SDP relaxations in solving high-dimensional statistical problems.
信号处理、数据挖掘和机器学习中出现的统计推断问题自然会引发棘手的组合优化问题。当数据维度很大时,这些问题就变得难以处理,现代数据集往往就是这种情况。一个流行的想法是构建这些组合问题的凸松弛,对于大规模数据集,这种凸松弛可以有效求解。半定规划(SDP)松弛是这类方法中最强大的方法之一,并且令人惊讶地非常适合数据采用矩阵或图形式的广泛问题。多次观察到,当统计噪声足够小时,SDP松弛能正确检测到潜在的组合结构。在本文中,我们针对几个检测阈值以及这些阈值之上的估计误差给出渐近预测。我们研究一些受网络中的图同步和社区检测启发的统计问题的经典SDP松弛。我们将这些优化问题映射到具有向量自旋的统计力学模型,并使用统计力学中的非严格技术来刻画相应的相变。我们的结果阐明了SDP松弛在解决高维统计问题中的有效性。