Giordani Paolo, Rocci Roberto
Dipartimento di Scienze Statistiche, Sapienza Università di Roma, P.le A. Moro, 5, 00185, Rome, Italy.
Psychometrika. 2013 Oct;78(4):669-84. doi: 10.1007/s11336-013-9321-9. Epub 2013 Feb 7.
The Candecomp/Parafac (CP) model is a well-known tool for summarizing a three-way array by extracting a limited number of components. Unfortunately, in some cases the model suffers from so-called degeneracy, that is, a solution with diverging and uninterpretable components. To avoid degeneracy, orthogonality constraints are usually applied to one of the component matrices. This solves the problem only from a technical point of view, because the existence of orthogonal components underlying the data is not guaranteed. To address this, we consider some variants of the CP model in which the orthogonality constraints are relaxed, either by constraining only a pair, or a subset, of components or by encouraging the CP solution toward orthogonality without enforcing it. We show theoretically that only the latter approach, which is based on the least absolute shrinkage and selection operator and named the CP-Lasso, is helpful in solving the degeneracy problem. Results from applying the CP-Lasso to simulated and real-life data show its effectiveness.
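As a rough illustration of the idea of relaxing orthogonality through a lasso-type penalty, the sketch below builds a three-way array from CP component matrices and evaluates a penalized least-squares loss. The abstract does not state the exact CP-Lasso objective, so the penalty used here (the sum of absolute inner products between distinct columns of one component matrix) and the names `cp_reconstruct`, `offdiag_l1`, `penalized_loss`, and `lam` are assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild an I x J x K array from CP component matrices (R columns each)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def offdiag_l1(A):
    """Sum of absolute inner products between distinct columns of A.

    Assumed penalty: it is zero exactly when the columns of A are
    mutually orthogonal, and grows as components become collinear.
    """
    G = A.T @ A
    return np.abs(G).sum() - np.abs(np.diag(G)).sum()

def penalized_loss(X, A, B, C, lam):
    """Squared reconstruction error plus lam times the assumed L1 penalty."""
    resid = X - cp_reconstruct(A, B, C)
    return np.sum(resid ** 2) + lam * offdiag_l1(A)
```

With `lam = 0` this reduces to the ordinary CP least-squares loss. Larger values of `lam` push the columns of `A` toward orthogonality without imposing it as a hard constraint, which is the kind of relaxation the abstract describes.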