Tay J Kenneth, Tibshirani Robert
Department of Statistics, Stanford University, Stanford, California, USA.
Department of Biomedical Data Science, Stanford University, Stanford, California, USA.
Int Stat Rev. 2020 Dec;88(Suppl 1):S205-S224. doi: 10.1111/insr.12429. Epub 2020 Nov 22.
Sparse generalised additive models (GAMs) are an extension of sparse generalised linear models that allow a model's prediction to vary non-linearly with an input variable. This enables the data analyst to build more accurate models, especially when the linearity assumption is known to be a poor approximation of reality. Motivated by reluctant interaction modelling, we propose a multi-stage algorithm, called RGAM, that can fit sparse GAMs at scale. It is guided by the principle that, if all else is equal, one should prefer a linear feature over a non-linear feature. Unlike existing methods for sparse GAMs, RGAM can be extended easily to binary, count and survival data. We demonstrate the method's effectiveness on real and simulated examples.
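The abstract does not spell out the stages, so the following is only a rough sketch of what a "linear first, non-linear second" multi-stage fit could look like, not the authors' implementation. It assumes three stages: a lasso on the linear features, a per-feature non-linear fit to the residuals, and a second lasso on the linear and non-linear features together. The cubic polynomial fit is a simple stand-in for whatever smoother the method actually uses, and the names `standardize`, `lasso_cd` and `rgam_sketch` are hypothetical.

```python
import numpy as np

def standardize(M):
    """Centre each column and scale it to unit standard deviation."""
    M = M - M.mean(axis=0)
    sd = M.std(axis=0)
    sd[sd == 0] = 1.0  # constant columns stay at zero instead of dividing by 0
    return M / sd

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - b0 - X b||^2 + lam * ||b||_1.
    Assumes the columns of X are standardized (mean 0, variance 1)."""
    n, p = X.shape
    beta = np.zeros(p)
    b0 = y.mean()
    resid = y - b0
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes column j.
            rho = X[:, j] @ resid / n + beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0)  # soft-thresholding
            resid -= X[:, j] * (new - beta[j])
            beta[j] = new
    return b0, beta

def rgam_sketch(X, y, lam=0.1, degree=3):
    """Assumed three-stage fit: linear lasso, residual smoothing, joint lasso."""
    Xs = standardize(X)
    n, p = Xs.shape
    # Stage 1: give linear features the first chance to explain the response.
    b0, beta = lasso_cd(Xs, y, lam)
    r = y - b0 - Xs @ beta
    # Stage 2: one non-linear feature per variable, fitted to the residuals.
    # (Assumption: a cubic polynomial stands in for the smoother.)
    F = np.column_stack([
        np.polyval(np.polyfit(Xs[:, j], r, degree), Xs[:, j]) for j in range(p)
    ])
    Fs = standardize(F)
    # Stage 3: lasso on linear and non-linear features together.
    Z = np.hstack([Xs, Fs])
    b0, coef = lasso_cd(Z, y, lam)
    return {"linear": coef[:p], "nonlinear": coef[p:], "fitted": b0 + Z @ coef}

# Simulated example: feature 0 acts linearly, feature 1 acts quadratically.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 2 * X[:, 0] + 3 * X[:, 1] ** 2 + 0.1 * rng.normal(size=300)
out = rgam_sketch(X, y)
print("linear coefficients:    ", np.round(out["linear"], 2))
print("non-linear coefficients:", np.round(out["nonlinear"], 2))
```

In this sketch, a non-linear feature can only pick up signal that the first-stage linear fit left in the residuals, which is one way to encode the stated preference for linear over non-linear features. The penalty `lam` and polynomial `degree` are illustrative values and would normally be chosen by cross-validation.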