Significance Tests for Boosted Location and Scale Models with Linear Base-Learners.

Authors

Hepp Tobias, Schmid Matthias, Mayr Andreas

Affiliations

Institut für medizinische Biometrie, Informatik und Epidemiologie, Medizinische Fakultät, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn, Germany.

Institut für Medizininformatik, Biometrie und Epidemiologie, Medizinische Fakultät, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.

Publication information

Int J Biostat. 2019 Apr 16;15(1):/j/ijb.2019.15.issue-1/ijb-2018-0110/ijb-2018-0110.xml. doi: 10.1515/ijb-2018-0110.

Abstract

Generalized additive models for location, scale and shape (GAMLSS) offer very flexible solutions to a wide range of statistical analysis problems, but can be challenging in terms of proper model specification. This complex task can be simplified using regularization techniques such as gradient boosting algorithms, but the estimates derived from such models are shrunken towards zero, and it is consequently not straightforward to calculate proper confidence intervals or test statistics. In this article, we propose two strategies, based on permutation tests and a parametric bootstrap approach, to obtain p-values for linear effect estimates in Gaussian location and scale models. These procedures can provide a solution for one of the remaining problems in the application of gradient boosting algorithms for distributional regression in biostatistical data analyses. Results from extensive simulations indicate that in low-dimensional data both suggested approaches are able to hold the type-I error threshold and provide reasonable test power comparable to the Wald-type test for maximum likelihood inference. In high-dimensional data, when gradient boosting is the only feasible inference approach for this model class, the power decreases but the type-I error is still under control. In addition, we demonstrate the application of both tests in an epidemiological study to analyse the impact of physical exercise on both the average and the stability of the lung function of elderly people in Germany.
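
The general idea of a permutation test for a boosted linear effect can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not give algorithmic details, so everything below is an assumption made for illustration. The sketch boosts only the location (mean) parameter with component-wise L2 boosting rather than a full Gaussian location and scale model, it builds the null distribution by permuting the covariate of interest (which also breaks its correlation with other covariates), and the function names `boost_l2` and `permutation_pvalue` as well as the defaults for `n_iter`, `nu` and `n_perm` are hypothetical.

```python
import numpy as np


def boost_l2(X, y, n_iter=100, nu=0.1):
    """Component-wise L2 boosting with linear base-learners (location only).

    Illustrative assumption: returns shrunken coefficients on the
    standardised covariate scale; no scale (variance) submodel is fitted.
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each column
    resid = y - y.mean()                      # offset: the sample mean
    beta = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(n_iter):
        # Least-squares slope of each single covariate on the residuals;
        # columns have unit variance, so the slope is x'r / n.
        slopes = X.T @ resid / n
        # The best-fitting base-learner is the one with the largest |slope|.
        j = np.argmax(np.abs(slopes))
        beta[j] += nu * slopes[j]             # weak update with step length nu
        resid = resid - nu * slopes[j] * X[:, j]
    return beta


def permutation_pvalue(X, y, j, n_perm=200, seed=0, **boost_args):
    """Permutation p-value for the boosted coefficient of covariate j."""
    rng = np.random.default_rng(seed)
    observed = abs(boost_l2(X, y, **boost_args)[j])
    null = np.empty(n_perm)
    for b in range(n_perm):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(X[:, j])   # break the link between x_j and y
        null[b] = abs(boost_l2(Xp, y, **boost_args)[j])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    y = 1.0 * X[:, 0] + rng.normal(size=200)  # only the first covariate matters
    print("p-value, informative covariate:", permutation_pvalue(X, y, j=0))
    print("p-value, noise covariate:", permutation_pvalue(X, y, j=2))
```

Because the same shrinkage is applied to the observed and the permuted fits, the comparison stays fair even though the boosted estimates themselves are biased towards zero. A parametric bootstrap variant would instead simulate new responses from a fitted null model.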
