Shiratori Tomokaze, Takano Yuichi
Graduate School of Science and Technology, University of Tsukuba, Tsukuba, Ibaraki, Japan.
Institute of Systems and Information Engineering, University of Tsukuba, Tsukuba, Ibaraki, Japan.
PLoS One. 2024 Dec 23;19(12):e0315740. doi: 10.1371/journal.pone.0315740. eCollection 2024.
Sparse estimation of a Gaussian graphical model (GGM) is an important technique for making relationships between observed variables more interpretable. Various methods have been proposed for sparse GGM estimation, including the graphical lasso that uses the ℓ1 norm regularization term, and other methods that use nonconvex regularization terms. Most of these methods approximate the ℓ0 (pseudo) norm by more tractable functions; however, to estimate more accurate solutions, it is preferable to directly use the ℓ0 norm for counting the number of nonzero elements. To this end, we focus on sparse estimation of GGM with the cardinality constraint based on the ℓ0 norm. Specifically, we convert the cardinality constraint into an equivalent constraint based on the largest-K norm, and reformulate the resultant constrained optimization problem into an unconstrained penalty form with a DC (difference of convex functions) representation. To solve this problem efficiently, we design a DC algorithm in which the graphical lasso algorithm is repeatedly executed to solve convex optimization subproblems. Experimental results using two synthetic datasets show that our method achieves results that are comparable to or better than conventional methods for sparse GGM estimation. Our method is particularly advantageous for selecting true edges when cross-validation is used to determine the number of edges. Moreover, our DC algorithm converges within a practical time frame compared to the graphical lasso.
高斯图形模型(GGM)的稀疏估计是一种使观测变量之间的关系更易于解释的重要技术。已经提出了各种用于稀疏GGM估计的方法,包括使用ℓ1范数正则化项的图形套索法,以及使用非凸正则化项的其他方法。这些方法大多通过更易于处理的函数来近似ℓ0(伪)范数;然而,为了估计更准确的解,直接使用ℓ0范数来计算非零元素的数量会更好。为此,我们专注于基于ℓ0范数的具有基数约束的GGM稀疏估计。具体来说,我们将基数约束转换为基于最大K范数的等效约束,并将由此产生的约束优化问题重新表述为具有DC(凸函数之差)表示的无约束惩罚形式。为了有效地解决这个问题,我们设计了一种DC算法,其中反复执行图形套索算法来求解凸优化子问题。使用两个合成数据集的实验结果表明,我们的方法在稀疏GGM估计方面取得了与传统方法相当或更好的结果。当使用交叉验证来确定边的数量时,我们的方法在选择真实边方面特别有利。此外,与图形套索法相比,我们的DC算法在实际时间范围内收敛。