Optimal Poisson subsampling decorrelated score for high-dimensional generalized linear models.

Authors

Shan Junhao, Wang Lei

Affiliation

School of Statistics and Data Science, KLMDASR, LEBPS and LPMC, Nankai University, Tianjin, People's Republic of China.

Publication information

J Appl Stat. 2024 Feb 11;51(14):2719-2743. doi: 10.1080/02664763.2024.2315467. eCollection 2024.

Abstract

For high-dimensional generalized linear models (GLMs) with massive data, this paper investigates a unified optimal Poisson subsampling scheme for estimation and inference on a prespecified low-dimensional partition of the whole parameter. A Poisson subsampling decorrelated score function is proposed so that the adverse effect of the less accurate, slowly converging nuisance parameter estimation can be mitigated. The resulting Poisson subsample estimator is proved to be consistent and asymptotically normal, and a more general optimal subsampling criterion, which includes the A- and L-optimality criteria, is formulated to improve estimation efficiency. We also propose a two-step algorithm for implementation and discuss some practical issues. The satisfactory performance of our method is validated through simulation studies and a real dataset.
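To make the idea of two-step Poisson subsampling concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a low-dimensional logistic regression and an ordinary inverse-probability-weighted likelihood in place of the decorrelated score, and it uses sampling probabilities proportional to |y_i - mu_i| * ||x_i||, which is an illustrative choice rather than the criterion derived in the paper. All names (`poisson_subsample`, `weighted_logistic_fit`, `r0`, `r`) are hypothetical.

```python
import numpy as np

def poisson_subsample(probs, rng):
    """Poisson subsampling: keep observation i independently with probability probs[i]."""
    keep = rng.random(probs.shape[0]) < probs
    return np.flatnonzero(keep)

def weighted_logistic_fit(X, y, w, n_iter=25):
    """Newton iterations for a weighted logistic regression (no intercept)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - mu))                       # weighted score
        hess = (X * (w * mu * (1.0 - mu))[:, None]).T @ X  # weighted information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated full data (assumed setup, for illustration only).
rng = np.random.default_rng(0)
n, d = 20000, 3
X = rng.normal(size=(n, d))
beta_true = np.array([0.5, -0.5, 0.25])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)

# Step 1: pilot estimate from a uniform Poisson subsample of expected size r0.
r0 = 500
p0 = np.full(n, r0 / n)
idx0 = poisson_subsample(p0, rng)
beta_pilot = weighted_logistic_fit(X[idx0], y[idx0], 1.0 / p0[idx0])

# Step 2: nonuniform probabilities built from the pilot, capped at 1,
# with expected subsample size roughly r (illustrative scoring rule).
r = 1000
mu_pilot = 1.0 / (1.0 + np.exp(-X @ beta_pilot))
score = np.abs(y - mu_pilot) * np.linalg.norm(X, axis=1)
probs = np.minimum(1.0, r * score / score.sum())
idx = poisson_subsample(probs, rng)

# Inverse-probability weights correct for the unequal inclusion probabilities.
beta_hat = weighted_logistic_fit(X[idx], y[idx], 1.0 / probs[idx])
print(len(idx), beta_hat)
```

Because each observation is included independently, the realized subsample size is random and only its expectation is controlled. The inclusion decision for one observation also never depends on the others, which is what makes this style of sampling convenient for data that cannot be held in memory at once.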

Similar articles

2
Optimal Subsampling for Large Sample Logistic Regression.
J Am Stat Assoc. 2018;113(522):829-844. doi: 10.1080/01621459.2017.1292914. Epub 2018 Jun 6.
6
Subsampling based variable selection for generalized linear models.
Comput Stat Data Anal. 2023 Aug;184. doi: 10.1016/j.csda.2023.107740. Epub 2023 Mar 11.
7
: a fast subsampling algorithm for Cox model with distributed and massive survival data.
Int J Biostat. 2025 Feb 4;21(1):53-65. doi: 10.1515/ijb-2024-0042. eCollection 2025 May 1.

References cited in this article

1
Test of Significance for High-Dimensional Longitudinal Data.
Ann Stat. 2020 Oct;48(5):2622-2645. doi: 10.1214/19-aos1900. Epub 2020 Sep 19.
2
Optimal Subsampling for Large Sample Logistic Regression.
J Am Stat Assoc. 2018;113(522):829-844. doi: 10.1080/01621459.2017.1292914. Epub 2018 Jun 6.
