Yao Lei Xu, Kriton Konstantinidis, Danilo P. Mandic
Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, U.K.
Neural Comput. 2023 Jun 20:1-26. doi: 10.1162/neco_a_01598.
Modern data analytics applications are increasingly characterized by exceedingly large and multidimensional data sources. This represents a challenge for traditional machine learning models, as the number of model parameters needed to process such data grows exponentially with the data dimensions, an effect known as the curse of dimensionality. Recently, tensor decomposition (TD) techniques have shown promising results in reducing the computational costs associated with high-dimensional models while achieving comparable performance. However, such tensor models are often unable to incorporate the underlying domain knowledge when compressing high-dimensional models. To this end, we introduce a novel graph-regularized tensor regression (GRTR) framework, whereby domain knowledge about intramodal relations is incorporated into the model in the form of a graph Laplacian matrix. This is then used as a regularization tool to promote a physically meaningful structure within the model parameters. By virtue of tensor algebra, the proposed framework is shown to be fully interpretable, both coefficient-wise and dimension-wise. The GRTR model is validated in a multiway regression setting, where it is shown to outperform competing models at reduced computational costs. Detailed visualizations are provided to help readers gain an intuitive understanding of the employed tensor operations.
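To give a concrete feel for the graph-Laplacian regularization idea described above, the following is a minimal sketch on a plain (non-tensor) linear regression, not the authors' GRTR method itself. The chain graph, the smooth ground-truth coefficients, and the regularization strength `lam` are all illustrative assumptions: the penalty term w^T L w sums the squared differences of coefficients connected in the graph, so coefficients that the domain knowledge links are pushed toward similar values.

```python
import numpy as np

# Illustrative (hypothetical) example: graph-Laplacian-regularized linear
# regression, a simplified non-tensor analogue of the GRTR idea.
rng = np.random.default_rng(0)
n, d = 100, 5

# Assumed domain knowledge: a chain graph over the 5 features,
# i.e., neighboring coefficients are expected to be similar.
A = np.zeros((d, d))
for i in range(d - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A  # graph Laplacian: degree matrix minus adjacency

# Smooth ground-truth coefficients (neighbors similar), noisy observations
w_true = np.array([1.0, 1.1, 1.2, 1.1, 1.0])
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

lam = 1.0  # assumed regularization strength
# Closed-form minimizer of ||y - X w||^2 + lam * w^T L w
w_hat = np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```

Note that w^T L w = sum over edges (i, j) of (w_i - w_j)^2, so the Laplacian acts exactly as the structure-promoting regularizer the abstract refers to; the GRTR framework applies this principle mode-wise within a tensor-decomposed parameter space rather than to a flat coefficient vector.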