Salzo Saverio, Masecchia Salvatore, Verri Alessandro, Barla Annalisa
DIMA, Università degli Studi di Genova, Via Dodecaneso 35, 16146 Genoa, Italy
Neural Comput. 2014 Dec;26(12):2855-95. doi: 10.1162/NECO_a_00672. Epub 2014 Sep 23.
We present an algorithm for dictionary learning that is based on the alternating proximal algorithm studied by Attouch, Bolte, Redont, and Soubeyran (2010), coupled with a reliable and efficient dual algorithm for computation of the related proximity operators. This algorithm is suitable for a general dictionary learning model composed of a Bregman-type data fit term that accounts for the goodness of the representation and several convex penalization terms on the coefficients and atoms, explaining the prior knowledge at hand. As Attouch et al. recently proved, an alternating proximal scheme ensures better convergence properties than the simpler alternating minimization. We take care of the issue of inexactness in the computation of the involved proximity operators, giving a sound stopping criterion for the dual inner algorithm, which keeps under control the related errors, unavoidable for such a complex penalty terms, providing ultimately an overall effective procedure. Thanks to the generality of the proposed framework, we give an application in the context of genome-wide data understanding, revising the model proposed by Nowak, Hastie, Pollack, and Tibshirani (2011). The aim is to extract latent features (atoms) and perform segmentation on array-based comparative genomic hybridization (aCGH) data. We improve several important aspects that increase the quality and interpretability of the results. We show the effectiveness of the proposed model with two experiments on synthetic data, which highlight the enhancements over the original model.
我们提出了一种基于Attouch、Bolte、Redont和Soubeyran(2010年)所研究的交替近端算法的字典学习算法,并结合了一种可靠且高效的对偶算法来计算相关的邻近算子。该算法适用于由一个Bregman型数据拟合项(用于说明表示的优良性)以及几个关于系数和原子的凸惩罚项组成的一般字典学习模型,这些惩罚项解释了手头的先验知识。正如Attouch等人最近所证明的,交替近端方案比更简单的交替最小化具有更好的收敛特性。我们处理了所涉及的邻近算子计算中的不精确性问题,为对偶内算法给出了合理的停止准则,该准则能控制相关误差,对于如此复杂的惩罚项而言,这些误差是不可避免的,最终提供了一个总体有效的程序。由于所提出框架的通用性,我们在全基因组数据理解的背景下给出了一个应用,改进了Nowak、Hastie、Pollack和Tibshirani((2011年)提出的模型。目的是提取潜在特征(原子)并对基于阵列的比较基因组杂交(aCGH)数据进行分割。我们改进了几个重要方面,提高了结果的质量和可解释性。我们通过对合成数据的两个实验展示了所提出模型的有效性,这些实验突出了相对于原始模型的改进。