Gatsby Unit, University College London, London WC1N 3AR, U.K.
Neural Comput. 2012 Jun;24(6):1391-407. doi: 10.1162/NECO_a_00283. Epub 2012 Feb 24.
The concave-convex procedure (CCCP) is an iterative algorithm that solves d.c. (difference of convex functions) programs as a sequence of convex programs. In machine learning, CCCP is extensively used in many learning algorithms, including sparse support vector machines (SVMs), transductive SVMs, and sparse principal component analysis. Though CCCP is widely used in many applications, its convergence behavior has not gotten a lot of specific attention. Yuille and Rangarajan analyzed its convergence in their original paper; however, we believe the analysis is not complete. The convergence of CCCP can be derived from the convergence of the d.c. algorithm (DCA), proposed in the global optimization literature to solve general d.c. programs, whose proof relies on d.c. duality. In this note, we follow a different reasoning and show how Zangwill's global convergence theory of iterative algorithms provides a natural framework to prove the convergence of CCCP. This underlines Zangwill's theory as a powerful and general framework to deal with the convergence issues of iterative algorithms, after also being used to prove the convergence of algorithms like expectation-maximization and generalized alternating minimization. In this note, we provide a rigorous analysis of the convergence of CCCP by addressing two questions: When does CCCP find a local minimum or a stationary point of the d.c. program under consideration? and when does the sequence generated by CCCP converge? We also present an open problem on the issue of local convergence of CCCP.
凹凸过程(CCCP)是一种迭代算法,可将凸函数的差(d.c.)程序作为一系列凸程序来求解。在机器学习中,CCCP 在许多学习算法中得到了广泛的应用,包括稀疏支持向量机(SVM)、转导 SVM 和稀疏主成分分析。尽管 CCCP 在许多应用中得到了广泛的应用,但它的收敛行为并没有得到太多的关注。Yuille 和 Rangarajan 在他们的原始论文中分析了它的收敛性;然而,我们认为分析并不完整。CCCP 的收敛性可以从 d.c.算法(DCA)的收敛性中推导出来,该算法在全局优化文献中被提出,用于解决一般的 d.c.程序,其证明依赖于 d.c.对偶性。在本注记中,我们遵循不同的推理,并展示 Zangwill 的迭代算法全局收敛理论如何为证明 CCCP 的收敛性提供了一个自然的框架。这突显了 Zangwill 的理论作为一种强大而通用的框架,用于处理迭代算法的收敛问题,在证明像期望最大化和广义交替最小化等算法的收敛性之后。在本注记中,我们通过回答两个问题来对 CCCP 的收敛性进行严格的分析:CCCP 在考虑的 d.c.程序下何时找到局部极小值或驻点?以及 CCCP 生成的序列何时收敛?我们还提出了一个关于 CCCP 局部收敛性的开放性问题。