Carrizosa Emilio, Olivares-Nadal Alba V, Ramírez-Cobo Pepa
Biostatistics. 2017 Apr 1;18(2):244-259. doi: 10.1093/biostatistics/kxw042.
Vector autoregressive (VAR) models are a powerful and well-studied tool for analyzing multivariate time series. Because sparsity, which is crucial for identifying and visualizing joint dependencies and relevant causalities, is not expected to arise in the standard VAR model, several sparse variants have been introduced in the literature. In some cases, however, it may be of interest to control specific dimensions of the sparsity, such as the number of causal features allowed in the prediction. To the authors' knowledge, none of the existing methods gives the user full control over the different aspects of the sparsity of the solution. In this article, we propose a versatile sparsity-controlled VAR model that enables a proper visualization of potential causalities while allowing the user to control different dimensions of the sparsity according to any preferences she holds about the outcome. The model coefficients are found as the solution to an optimization problem that is solvable by standard numerical optimization routines. Tests performed on both simulated and real-life time series show that our approach may outperform a greedy algorithm and different Lasso approaches in terms of prediction errors and sparsity.
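As a rough illustration of why sparsity matters for reading causalities off a VAR model, the sketch below computes a one-step forecast from a given set of lag matrices and lists the nonzero coefficients as (source, target, lag) links. It is not the authors' optimization model: the function names `var_forecast` and `causal_edges`, the coefficient matrix `A1`, and the data are hypothetical and chosen only for illustration, and the coefficients are assumed to be already estimated.

```python
def var_forecast(coeffs, history):
    """One-step forecast of a VAR(p) model without intercept.

    coeffs[l][i][j] is the effect of series j at lag l+1 on series i.
    history is a list of past observations, most recent last.
    """
    p = len(coeffs)
    n = len(coeffs[0])
    forecast = [0.0] * n
    for lag in range(p):
        past = history[-(lag + 1)]
        for i in range(n):
            for j in range(n):
                forecast[i] += coeffs[lag][i][j] * past[j]
    return forecast


def causal_edges(coeffs, tol=1e-12):
    """List (source, target, lag) triples for every nonzero coefficient."""
    edges = []
    for lag, matrix in enumerate(coeffs, start=1):
        for i, row in enumerate(matrix):
            for j, value in enumerate(row):
                if abs(value) > tol:
                    edges.append((j, i, lag))
    return edges


# Hypothetical sparse VAR(1) on three series: series 0 drives itself and
# series 1, while series 2 depends only on its own past.
A1 = [
    [0.5, 0.0, 0.0],
    [0.4, 0.0, 0.0],
    [0.0, 0.0, 0.25],
]
history = [[1.0, 2.0, 4.0]]

print(var_forecast([A1], history))
print(causal_edges([A1]))
```

With only three nonzero entries, the list of links is short enough to draw as a small directed graph; a dense coefficient matrix of the same size would yield nine links per lag and be much harder to interpret.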