glmmPen: High Dimensional Penalized Generalized Linear Mixed Models.

Authors

Heiling Hillary M, Rashid Naim U, Li Quefeng, Ibrahim Joseph G

Affiliation

University of North Carolina Chapel Hill.

Publication information

R J. 2023 Dec;15(4):106-128. doi: 10.32614/rj-2023-086. Epub 2024 Apr 10.

Abstract

Generalized linear mixed models (GLMMs) are widely used in research for their ability to model correlated outcomes with non-Gaussian conditional distributions. The proper selection of fixed and random effects is a critical part of the modeling process, where model misspecification may lead to significant bias. However, the joint selection of fixed and random effects has historically been limited to lower-dimensional GLMMs, largely due to the use of criterion-based model selection strategies. Here we present the R package glmmPen, one of the first to select fixed and random effects in higher dimensions using a penalized GLMM modeling framework. Model parameters are estimated using a Monte Carlo expectation conditional minimization (MCECM) algorithm, which leverages Stan and RcppArmadillo for increased computational efficiency. Our package supports the Binomial, Gaussian, and Poisson families and multiple penalty functions. In this manuscript, we discuss the modeling procedure, estimation scheme, and software implementation through application to a pancreatic cancer subtyping study. Simulation results show that our method performs well in selecting both the fixed and random effects in high-dimensional GLMMs.
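
To illustrate how a penalty function can perform variable selection, the following is a minimal, generic sketch in Python. It is not the glmmPen implementation and does not reproduce the MCECM algorithm. The abstract does not name the supported penalties, so the lasso and MCP thresholding rules used here are assumptions chosen as common examples, and the function names, coefficient values, and tuning parameters are all hypothetical.

```python
def soft_threshold(z: float, lam: float) -> float:
    """Lasso (L1) thresholding: shrink z toward zero by lam, clipping at zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_threshold(z: float, lam: float, gamma: float = 3.0) -> float:
    """MCP thresholding for a standardized predictor (assumes gamma > 1)."""
    if abs(z) <= gamma * lam:
        # Small and moderate values are shrunk, then rescaled.
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    # Large values are left unpenalized.
    return z


# Hypothetical unpenalized coordinate-wise estimates for four coefficients.
raw_estimates = [2.5, -0.3, 0.8, -1.7]
lam = 0.5  # hypothetical penalty strength

lasso_estimates = [soft_threshold(z, lam) for z in raw_estimates]
mcp_estimates = [mcp_threshold(z, lam) for z in raw_estimates]

# A coefficient is "selected" if the penalty did not set it to exactly zero.
selected = [j for j, b in enumerate(lasso_estimates) if b != 0.0]

print("lasso:", lasso_estimates)
print("MCP:  ", mcp_estimates)
print("selected indices:", selected)
```

In this sketch, any estimate whose magnitude falls below the penalty strength is set to exactly zero and so drops out of the model. In a penalized GLMM, rules of this kind would be applied inside the minimization step to both fixed effects and random effects.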
