Tank Alex, Li Xiudi, Fox Emily B, Shojaie Ali
The Voleon Group, Berkeley, CA.
Department of Biostatistics, University of Washington, Seattle WA.
SIAM J Math Data Sci. 2021;3(1):83-112. doi: 10.1137/20m133097x.
We present a framework for learning Granger causality networks for multivariate categorical time series based on the mixture transition distribution (MTD) model. Traditionally, MTD is plagued by a nonconvex objective, non-identifiability, and the presence of local optima. To circumvent these problems, we recast inference in the MTD as a convex problem. The new formulation facilitates the application of MTD to high-dimensional multivariate time series. As a baseline, we also formulate a multi-output logistic autoregressive model (mLTD), which, while a straightforward extension of autoregressive Bernoulli generalized linear models, has not previously been applied to the analysis of multivariate categorical time series. We establish identifiability conditions for the MTD model and compare them to those for mLTD. We further devise novel and efficient optimization algorithms for MTD based on our proposed convex formulation, and compare MTD and mLTD in both simulated and real data experiments. Finally, we establish consistency of the convex MTD in high dimensions. Our approach simultaneously provides a comparison of methods for network inference in categorical time series and opens the door to modern, regularized inference with the MTD model.
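To make the MTD idea in the abstract concrete, the following is a minimal sketch of a lag-1 MTD conditional distribution for a single target series, written as a convex mixture of an intercept distribution and one pairwise transition matrix per input series. It is an illustration under stated assumptions, not the authors' implementation: the names `mtd_conditional` and `granger_parents`, the array layout, and the example numbers are all hypothetical, and reading Granger non-causality off a zero mixture weight is only a sufficient condition; the identifiability conditions established in the paper are not reproduced here.

```python
import numpy as np

def mtd_conditional(gamma, p0, P, x_prev):
    """Distribution of one target series at time t given all series at t-1.

    Assumed layout (hypothetical, for illustration only):
      gamma  : shape (d + 1,), nonnegative, sums to 1; gamma[0] weights p0
      p0     : shape (m,), intercept distribution over the m target states
      P      : shape (d, m, m); P[j][:, k] is the target distribution when
               series j was in state k at time t-1 (each column sums to 1)
      x_prev : shape (d,), integer states of the d series at time t-1
    """
    probs = gamma[0] * p0
    for j, state in enumerate(x_prev):
        # Each series contributes its own pairwise transition column,
        # weighted by its mixture weight.
        probs = probs + gamma[j + 1] * P[j][:, state]
    return probs

def granger_parents(gamma, tol=1e-8):
    """Indices of series with a nonzero mixture weight (sufficient check only)."""
    return [j for j, g in enumerate(gamma[1:]) if g > tol]

# Toy example: two binary input series, only the first one carries weight.
gamma = np.array([0.2, 0.8, 0.0])
p0 = np.array([0.5, 0.5])
P = np.array([
    [[0.9, 0.1], [0.1, 0.9]],  # series 0: strongly informative
    [[0.5, 0.5], [0.5, 0.5]],  # series 1: weight is zero, so it is ignored
])
x_prev = np.array([0, 1])

probs = mtd_conditional(gamma, p0, P, x_prev)
parents = granger_parents(gamma)
print(probs, parents)
```

Because the conditional distribution is linear in the products of each mixture weight with its transition matrix, optimizing over those products directly (rather than over the weights and matrices separately) is one way to see why a convex reformulation is plausible; the exact parameterization and constraints used in the paper are not shown in this sketch.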