Departments of Statistics and Electrical Engineering, Stanford University, Stanford, CA 94305, USA.
Proc Natl Acad Sci U S A. 2013 May 21;110(21):8405-10. doi: 10.1073/pnas.1306110110. Epub 2013 May 6.
Let $X_0$ be an unknown $M \times N$ matrix. In matrix recovery, one takes $n < MN$ linear measurements $y_1,\dots,y_n$ of $X_0$, where $y_i = \operatorname{Tr}(A_i^\top X_0)$ and each $A_i$ is an $M \times N$ matrix. A popular approach to matrix recovery is nuclear norm minimization (NNM): solving the convex optimization problem $\min_X \|X\|_*$ subject to $y_i = \operatorname{Tr}(A_i^\top X)$ for all $1 \le i \le n$, where $\|\cdot\|_*$ denotes the nuclear norm, namely the sum of singular values. Empirical work reveals a phase transition curve, stated in terms of the undersampling fraction $\delta(n,M,N) = n/(MN)$, rank fraction $\rho = \operatorname{rank}(X_0)/\min\{M,N\}$, and aspect ratio $\beta = M/N$. Specifically, when the measurement matrices $A_i$ have independent standard Gaussian random entries, a curve $\delta^*(\rho) = \delta^*(\rho;\beta)$ exists such that, if $\delta > \delta^*(\rho)$, NNM typically succeeds for large $M,N$, whereas if $\delta < \delta^*(\rho)$, it typically fails. An apparently quite different problem is matrix denoising in Gaussian noise, in which an unknown $M \times N$ matrix $X_0$ is to be estimated based on direct noisy measurements $Y = X_0 + Z$, where the matrix $Z$ has independent and identically distributed Gaussian entries. A popular matrix denoising scheme solves the unconstrained optimization problem $\min_X \|Y - X\|_F^2/2 + \lambda\|X\|_*$. When optimally tuned, this scheme achieves the asymptotic minimax mean-squared error $M(\rho;\beta) = \lim_{M,N \to \infty} \inf_\lambda \sup_{\operatorname{rank}(X) \le \rho \cdot M} \operatorname{MSE}(X, \hat X_\lambda)$, where $M/N \to \beta$. We report extensive experiments showing that the phase transition $\delta^*(\rho)$ in the first problem, matrix recovery from Gaussian measurements, coincides with the minimax risk curve $M(\rho) = M(\rho;\beta)$ in the second problem, matrix denoising in Gaussian noise: $\delta^*(\rho) = M(\rho)$ for any rank fraction $0 < \rho < 1$ (at each common aspect ratio $\beta$). Our experiments considered matrices belonging to two constraint classes: real $M \times N$ matrices of various ranks and aspect ratios, and real symmetric positive-semidefinite $N \times N$ matrices of various ranks.
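As a rough illustration of the recovery experiment described above, the following is a minimal sketch of NNM from Gaussian measurements using NumPy and CVXPY; the dimensions, rank, undersampling fraction, and success threshold are illustrative assumptions rather than values from the paper.

```python
# Minimal sketch of nuclear norm minimization (NNM) from Gaussian measurements.
# Shapes, rank, and the undersampling fraction below are illustrative choices.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
M, N, r = 20, 30, 2                 # matrix shape and rank of X_0
delta = 0.5                         # undersampling fraction n / (M N)
n = int(delta * M * N)              # number of linear measurements

# Rank-r ground truth X_0 and Gaussian measurement matrices A_i.
X0 = rng.standard_normal((M, r)) @ rng.standard_normal((r, N))
A = rng.standard_normal((n, M, N))
y = np.array([np.trace(Ai.T @ X0) for Ai in A])   # y_i = Tr(A_i^T X_0)

# NNM: minimize ||X||_* subject to Tr(A_i^T X) = y_i for all i.
X = cp.Variable((M, N))
constraints = [cp.trace(A[i].T @ X) == y[i] for i in range(n)]
prob = cp.Problem(cp.Minimize(cp.normNuc(X)), constraints)
prob.solve()

# Declare success if the relative recovery error is small.
rel_err = np.linalg.norm(X.value - X0, "fro") / np.linalg.norm(X0, "fro")
print(f"relative error: {rel_err:.2e}, success: {rel_err < 1e-3}")
```

Repeating such trials over a grid of $(\delta, \rho)$ values and recording the empirical success probability is one way to trace out the phase transition curve $\delta^*(\rho)$ experimentally.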
Let $X_0$ be an unknown $M\times N$ matrix. In matrix recovery, one takes $n<MN$ linear measurements $y_1,\dots,y_n$ of $X_0$, where $y_i=\operatorname{Tr}(A_i^\top X_0)$ and each $A_i$ is an $M\times N$ matrix. A popular approach to matrix recovery is nuclear norm minimization (NNM): solving the convex optimization problem
$$
\min_{X}\ \|X\|_* \quad \text{s.t.}\quad y_i=\operatorname{Tr}(A_i^\top X)\quad \forall\, 1\leq i\leq n,
$$
where $\|\cdot\|_*$ denotes the nuclear norm, i.e., the sum of the singular values. Empirical work reveals a phase transition curve, stated in terms of the undersampling fraction $\delta(n,M,N)=n/(MN)$, the rank fraction $\rho=\operatorname{rank}(X_0)/\min\{M,N\}$, and the aspect ratio $\beta=M/N$. Specifically, when the measurement matrices $A_i$ have independent standard Gaussian entries, there exists a curve $\delta^*(\rho)=\delta^*(\rho;\beta)$ such that, if $\delta>\delta^*(\rho)$, NNM typically succeeds for large $M,N$, whereas if $\delta<\delta^*(\rho)$, it typically fails. An apparently quite different problem is matrix denoising in Gaussian noise, in which an unknown $M\times N$ matrix $X_0$ is to be estimated from direct noisy measurements $Y=X_0+Z$, where the matrix $Z$ has independent and identically distributed Gaussian entries. A popular matrix denoising scheme solves the unconstrained optimization problem
$$
\min_{X}\ \|Y-X\|_F^2/2+\lambda\|X\|_*,
$$
When optimally tuned, this scheme achieves the asymptotic minimax mean-squared error $M(\rho;\beta)=\lim_{M,N\to\infty}\inf_\lambda\sup_{\operatorname{rank}(X)\leq\rho\cdot M}\operatorname{MSE}(X,\hat X_\lambda)$, where $M/N\to\beta$. We report extensive experiments showing that the phase transition $\delta^*(\rho)$ in the first problem, matrix recovery from Gaussian measurements, coincides with the minimax risk curve $M(\rho)=M(\rho;\beta)$ in the second problem, matrix denoising in Gaussian noise: $\delta^*(\rho)=M(\rho)$ for any rank fraction $0<\rho<1$ (at each common aspect ratio $\beta$). Our experiments considered matrices from two constraint classes: real $M\times N$ matrices of various ranks and aspect ratios, and real symmetric positive-semidefinite $N\times N$ matrices of various ranks.
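For concreteness, here is a minimal sketch of the denoising scheme just described: the penalized problem $\min_X \|Y-X\|_F^2/2+\lambda\|X\|_*$ is solved in closed form by soft-thresholding the singular values of $Y$ at level $\lambda$. The dimensions, rank, noise scale, and the grid of $\lambda$ values below are illustrative assumptions.

```python
# Minimal sketch of matrix denoising by singular value soft-thresholding,
# the closed-form minimizer of ||Y - X||_F^2 / 2 + lam * ||X||_*.
import numpy as np

def svt(Y, lam):
    """Soft-threshold the singular values of Y at level lam."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

rng = np.random.default_rng(1)
M, N, r = 50, 100, 5
X0 = rng.standard_normal((M, r)) @ rng.standard_normal((r, N))  # rank-r signal
Z = rng.standard_normal((M, N))                                 # i.i.d. Gaussian noise
Y = X0 + Z

# Sweep lambda and report the best (oracle-tuned) mean-squared error,
# a finite-size stand-in for the infimum over lambda in the minimax MSE.
lams = np.linspace(0.0, 3.0 * np.sqrt(N), 61)
mses = [np.mean((svt(Y, lam) - X0) ** 2) for lam in lams]
best = int(np.argmin(mses))
print(f"best lambda ~ {lams[best]:.2f}, MSE = {mses[best]:.3f}")
```

Repeating this over many random rank-$r$ matrices and taking the worst case approximates, for finite $M,N$, the $\sup$ over $\operatorname{rank}(X)\leq\rho\cdot M$ appearing in the definition of $M(\rho;\beta)$.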