Asad Haris, Daniela Witten, Noah Simon
Department of Biostatistics, University of Washington.
Departments of Statistics and Biostatistics, University of Washington.
J Comput Graph Stat. 2016;25(4):981-1004. doi: 10.1080/10618600.2015.1067217. Epub 2015 Aug 12.
We consider the task of fitting a regression model involving interactions among a potentially large set of covariates, in which we wish to enforce strong heredity. We propose FAMILY, a very general framework for this task. Our proposal is a generalization of several existing methods, such as VANISH [Radchenko and James, 2010], hierNet [Bien et al., 2013], the all-pairs lasso, and the lasso using only main effects. It can be formulated as the solution to a convex optimization problem, which we solve using an efficient alternating directions method of multipliers (ADMM) algorithm. This algorithm has guaranteed convergence to the global optimum, can be easily specialized to any convex penalty function of interest, and allows for a straightforward extension to the setting of generalized linear models. We derive an unbiased estimator of the degrees of freedom of FAMILY, and explore its performance in a simulation study and on an HIV sequence data set.
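As a rough illustration of the ADMM approach mentioned above, the sketch below applies ADMM to the simplest special case named in the abstract, the lasso using only main effects, with objective 0.5 * ||y - X b||^2 + lam * ||b||_1. It is not the FAMILY algorithm from the paper. The function names (soft_threshold, admm_lasso), the step-size parameter rho, and the fixed iteration count are assumptions made for this example.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def admm_lasso(X, y, lam, rho=1.0, n_iter=500):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by ADMM (illustrative sketch).

    The problem is split as b = z: the smooth least-squares term acts on b,
    the l1 penalty acts on z, and u is the scaled dual variable.
    rho and n_iter are assumed defaults, not values from the paper.
    """
    n, p = X.shape
    A = X.T @ X + rho * np.eye(p)  # system matrix of the ridge-like b-update
    Xty = X.T @ y
    z = np.zeros(p)
    u = np.zeros(p)
    for _ in range(n_iter):
        # b-update: solve (X'X + rho I) b = X'y + rho (z - u)
        b = np.linalg.solve(A, Xty + rho * (z - u))
        # z-update: proximal step on the l1 penalty
        z = soft_threshold(b + u, lam / rho)
        # dual update: accumulate the residual of the constraint b = z
        u = u + b - z
    return z
```

Under the same assumptions, the all-pairs lasso mentioned in the abstract would correspond to running this solver on a design matrix augmented with all pairwise products of the covariates. Enforcing strong heredity requires a different penalty, which this sketch does not attempt.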