Griesbach Colin, Säfken Benjamin, Waldmann Elisabeth
Department of Medical Informatics, Biometry and Epidemiology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.
Chair of Statistics, Georg-August-Universität Göttingen, Göttingen, Germany.
Int J Biostat. 2021 Jan 13;17(2):317-329. doi: 10.1515/ijb-2020-0136.
Gradient boosting, a technique from statistical learning that adapts concepts from classification theory, is widely known as a powerful framework for estimating and selecting predictor effects in various regression models. Current boosting approaches also offer methods that account for random effects and thus enable prediction with mixed models for longitudinal and clustered data. However, these approaches have several flaws: on the one hand, unbalanced effect selection with falsely induced shrinkage and a low convergence rate; on the other hand, biased estimates of the random effects. We therefore propose a new boosting algorithm that explicitly accounts for the random structure by excluding it from the selection procedure, properly correcting the random effects estimates, and additionally providing likelihood-based estimation of the random effects variance structure. The new algorithm offers an organic and unbiased fitting approach, as shown via simulations and data examples.
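The central idea of the abstract, keeping the random structure out of the selection step, can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function name `boost_random_intercept`, the restriction to a single random intercept, the step length `nu`, and the moment-based (ANOVA-type) variance estimates are all assumptions made here for illustration, and the paper's likelihood-based variance estimation and its correction of the random effects estimates are not reproduced.

```python
import numpy as np

def boost_random_intercept(y, X, cluster, n_iter=300, nu=0.1):
    """Hypothetical sketch: component-wise L2 boosting for fixed effects,
    with a random intercept that is updated in every iteration and never
    competes with the predictors for selection."""
    n, p = X.shape
    ids, idx = np.unique(cluster, return_inverse=True)
    n_i = np.bincount(idx)                    # cluster sizes
    x_bar = X.mean(axis=0)
    Xc = X - x_bar                            # centered predictors
    ss = (Xc ** 2).sum(axis=0)                # sum of squares per predictor
    intercept, beta, b = y.mean(), np.zeros(p), np.zeros(len(ids))
    sigma2 = tau2 = 1.0

    for _ in range(n_iter):
        # 1) Fixed effects: fit every predictor to the current residuals,
        #    but update only the one that reduces the RSS the most.
        r = y - (intercept + X @ beta + b[idx])
        slopes = Xc.T @ r / ss
        j = int(np.argmax(slopes ** 2 * ss))
        beta[j] += nu * slopes[j]
        intercept += nu * (r.mean() - slopes[j] * x_bar[j])

        # 2) Variance components: simple moment estimates (an assumption,
        #    standing in for the likelihood-based estimation in the paper).
        rf = y - (intercept + X @ beta)       # residuals of the fixed part
        m = np.bincount(idx, weights=rf) / n_i
        sigma2 = max(((rf - m[idx]) ** 2).sum() / max(n - len(ids), 1), 1e-8)
        tau2 = max(np.var(m, ddof=1) - sigma2 / n_i.mean(), 1e-8)

        # 3) Random intercepts: always updated, via a ridge-type shrunken
        #    cluster mean of the residuals; no selection takes place here.
        r = rf - b[idx]
        r_bar = np.bincount(idx, weights=r) / n_i
        b += nu * (n_i / (n_i + sigma2 / tau2)) * r_bar

    return {"intercept": intercept, "beta": beta, "b": b,
            "sigma2": sigma2, "tau2": tau2}
```

Called as `boost_random_intercept(y, X, cluster)` with a response vector, a predictor matrix, and a vector of cluster labels, the sketch returns the fixed-effect coefficients, the predicted random intercepts, and the two variance estimates. Because step 3 runs in every iteration regardless of which predictor wins step 1, the random intercept cannot crowd out or be crowded out by the fixed effects.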