Vaughan Gregory, Aseltine Robert, Chen Kun, Yan Jun
Department of Mathematical Sciences, Bentley University, Waltham, Massachusetts, USA.
Center for Population Health, University of Connecticut Health Center, Farmington, Connecticut, USA.
Stat Med. 2020 Sep 30;39(22):2855-2868. doi: 10.1002/sim.8574. Epub 2020 Jul 27.
Model selection in the presence of interaction terms is challenging as the final model must maintain a hierarchy between main effects and interaction terms. This work presents two stagewise estimation approaches to appropriately select models with interaction terms that can utilize generalized estimating equations to model clustered data. The first proposed technique is a hierarchical lasso stagewise estimating equations approach, which is shown to directly correspond to the hierarchical lasso penalized regression. The second is a stagewise active set approach, which enforces the variable hierarchy by restricting the selection in each stagewise estimation step to a properly growing active set. The effectiveness in interaction selection and the superior computational efficiency of the proposed techniques are assessed in simulation studies. The new methods are applied to a study of hospitalization rates attributed to suicide attempts among 15- to 19-year-olds at the school district level in Connecticut.
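The stagewise active set idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an identity link and an independence working correlation (so the estimating function reduces to the least-squares score), a fixed step size, and a strong-hierarchy rule in which an interaction becomes eligible only after both of its parent main effects have been selected. The function name `stagewise_active_set` and its parameters are hypothetical.

```python
import itertools
import numpy as np


def stagewise_active_set(X, y, step=0.01, n_steps=2000):
    """Forward stagewise fit restricted to a hierarchy-respecting active set.

    Assumptions made for illustration only: identity link, independence
    working correlation, strong hierarchy, fixed step size.
    """
    n, p = X.shape
    pairs = list(itertools.combinations(range(p), 2))
    pa = np.array([a for a, _ in pairs])
    pb = np.array([b for _, b in pairs])

    # Design matrix: p main effects followed by all pairwise products,
    # standardized so that one step changes each score by a comparable amount.
    Z = np.hstack([X, X[:, pa] * X[:, pb]])
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    r = y - y.mean()

    beta = np.zeros(Z.shape[1])
    active = np.zeros(Z.shape[1], dtype=bool)
    active[:p] = True  # main effects are always eligible

    for _ in range(n_steps):
        # Estimating function evaluated at the current coefficients.
        u = Z.T @ (r - Z @ beta) / n
        score = np.where(active, np.abs(u), -np.inf)
        j = int(np.argmax(score))
        if score[j] < step / 2:
            break  # a further step would overshoot
        beta[j] += step * np.sign(u[j])

        # Grow the active set: admit an interaction once both parents are nonzero.
        nonzero_main = beta[:p] != 0
        active[p:] |= nonzero_main[pa] & nonzero_main[pb]

    return beta, pairs


# Example on simulated data with two main effects and their interaction.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
y = 2 * X[:, 0] + 2 * X[:, 1] + 1.5 * X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(200)
beta, pairs = stagewise_active_set(X, y)
```

In this sketch the active set only grows, so an interaction stays eligible even if a parent coefficient later steps back to exactly zero; a fuller treatment would need to handle that case, and clustered data would call for a non-trivial working correlation in the estimating function.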