Zhou Jie, Sun Will Wei, Zhang Jingfei, Li Lexin
Department of Management Science, University of Miami Herbert Business School, Miami, FL.
Krannert School of Management, Purdue University, West Lafayette, IN.
J Am Stat Assoc. 2023;118(541):424-439. doi: 10.1080/01621459.2021.1938082. Epub 2021 Jul 19.
In modern data science, dynamic tensor data arise in numerous applications. An important task is to characterize the relationship between dynamic tensor datasets and external covariates. However, the tensor data are often only partially observed, which renders many existing methods inapplicable. In this article, we develop a regression model with a partially observed dynamic tensor as the response and external covariates as the predictors. We impose low-rank, sparsity, and fusion structures on the regression coefficient tensor, and consider a loss function projected onto the observed entries. We develop an efficient nonconvex alternating updating algorithm, and derive a finite-sample error bound for the actual estimator obtained at each step of the optimization algorithm. Unobserved entries in the tensor response pose serious challenges. As a result, our proposal differs considerably from existing tensor completion and tensor response regression solutions in its estimation algorithm, regularity conditions, and theoretical properties. We illustrate the efficacy of the proposed method using simulations and two real applications: a neuroimaging dementia study and a digital advertising study.
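To make the idea of a loss restricted to observed entries more concrete, here is a minimal Python sketch. It is not the authors' algorithm. The CP (rank-R) parameterization of the coefficient tensor, the tensor shapes, the toy data, and the function names `cp_to_tensor`, `observed_loss`, and `fusion_penalty` are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def cp_to_tensor(weights, A, B, C, D):
    """Assemble a rank-R coefficient tensor of shape (d1, d2, T, q) from CP
    factors. Assumed parameterization, not taken from the abstract."""
    return np.einsum("r,ar,br,tr,qr->abtq", weights, A, B, C, D)

def observed_loss(Y, X, mask, coef):
    """Squared error summed over observed entries only, averaged over samples.

    Y    : (n, d1, d2, T) responses; unobserved entries may hold NaN
    X    : (n, q) covariates
    mask : (n, d1, d2, T) boolean, True where Y is observed
    coef : (d1, d2, T, q) coefficient tensor
    """
    fitted = np.einsum("abtq,nq->nabt", coef, X)
    # Entries outside the mask contribute zero, so NaNs there are ignored.
    resid = np.where(mask, Y - fitted, 0.0)
    return float(np.sum(resid ** 2) / Y.shape[0])

def fusion_penalty(time_factor):
    """Sum of absolute differences between adjacent time points
    (a fused-lasso style penalty on the assumed time-mode factor)."""
    return float(np.abs(np.diff(time_factor, axis=0)).sum())

# Toy data (hypothetical sizes): 20 samples, a 4 x 5 x 6 response, 3 covariates.
rng = np.random.default_rng(0)
n, d1, d2, T, q, R = 20, 4, 5, 6, 3, 2
weights = np.ones(R)
A, B, D = (rng.normal(size=(d, R)) for d in (d1, d2, q))
C = np.repeat(rng.normal(size=(T // 2, R)), 2, axis=0)  # piecewise constant in time
coef = cp_to_tensor(weights, A, B, C, D)

X = rng.normal(size=(n, q))
Y = np.einsum("abtq,nq->nabt", coef, X)   # noiseless responses
mask = rng.random(Y.shape) < 0.6          # roughly 60% of entries observed
Y_obs = np.where(mask, Y, np.nan)         # hide the unobserved entries

print("loss at true coefficients:", observed_loss(Y_obs, X, mask, coef))
print("loss at zero coefficients:", observed_loss(Y_obs, X, mask, np.zeros_like(coef)))
print("fusion penalty on time factor:", fusion_penalty(C))
```

In this sketch the mask plays the role of the projection onto observed entries, so missing values never enter the objective. An alternating scheme would then update one factor at a time while holding the others fixed, with sparsity and fusion enforced on the factors; those steps are omitted here.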