Lim Michael, Hastie Trevor
Statistics Department, Stanford University.
J Comput Graph Stat. 2015;24(3):627-654. doi: 10.1080/10618600.2014.938812. Epub 2015 Sep 16.
We introduce a method for learning pairwise interactions in a linear regression or logistic regression model in a manner that satisfies strong hierarchy: whenever an interaction is estimated to be nonzero, both its associated main effects are also included in the model. We motivate our approach by modeling pairwise interactions for categorical variables with arbitrary numbers of levels, and then show how we can accommodate continuous variables as well. Our approach allows us to dispense with explicitly applying constraints on the main effects and interactions for identifiability, which results in interpretable interaction models. We compare our method with existing approaches on both simulated and real data, including a genome-wide association study, all using our R package glinternet.
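The strong hierarchy property described above can be stated as a simple check on a fitted model's coefficients. The following is a minimal sketch, not part of the glinternet package: the function name `satisfies_strong_hierarchy`, the dictionary layout, and the variable names `x1` to `x3` are hypothetical and chosen only for illustration. It also treats each effect as a single scalar coefficient, whereas a categorical variable would contribute a group of coefficients.

```python
def satisfies_strong_hierarchy(main_effects, interactions, tol=0.0):
    """Check strong hierarchy on a set of estimated coefficients.

    main_effects: dict mapping variable name -> estimated coefficient
    interactions: dict mapping (name_j, name_k) -> estimated coefficient
    tol: magnitudes at or below this value are treated as zero
    (All names here are illustrative assumptions, not glinternet's API.)
    """
    # Variables whose main effect is estimated to be nonzero.
    active = {name for name, coef in main_effects.items() if abs(coef) > tol}
    for (j, k), coef in interactions.items():
        # A nonzero interaction requires BOTH of its main effects to be active.
        if abs(coef) > tol and not (j in active and k in active):
            return False
    return True


if __name__ == "__main__":
    main = {"x1": 0.8, "x2": -0.3, "x3": 0.0}
    # x1:x2 is allowed because both main effects are nonzero.
    print(satisfies_strong_hierarchy(main, {("x1", "x2"): 0.5}))
    # x1:x3 violates strong hierarchy because the x3 main effect is zero.
    print(satisfies_strong_hierarchy(main, {("x1", "x3"): 0.5}))
```

Under weak hierarchy, by contrast, only one of the two main effects would need to be nonzero; the check above is deliberately the stricter of the two.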