Heiling Hillary M, Rashid Naim U, Li Quefeng, Ibrahim Joseph G
University of North Carolina Chapel Hill.
R J. 2023 Dec;15(4):106-128. doi: 10.32614/rj-2023-086. Epub 2024 Apr 10.
Generalized linear mixed models (GLMMs) are widely used in research for their ability to model correlated outcomes with non-Gaussian conditional distributions. Proper selection of fixed and random effects is a critical part of the modeling process, because model misspecification may lead to substantial bias. However, the joint selection of fixed and random effects has historically been limited to lower-dimensional GLMMs, largely due to the use of criterion-based model selection strategies. Here we present the R package glmmPen, one of the first to select fixed and random effects in higher dimensions using a penalized GLMM modeling framework. Model parameters are estimated using a Monte Carlo expectation conditional minimization (MCECM) algorithm, which leverages Stan and RcppArmadillo for increased computational efficiency. Our package supports the Binomial, Gaussian, and Poisson families and multiple penalty functions. In this manuscript we discuss the modeling procedure, estimation scheme, and software implementation through application to a pancreatic cancer subtyping study. Simulation results show that our method performs well in selecting both the fixed and random effects in high-dimensional GLMMs.
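For readers unfamiliar with penalized GLMMs, the general shape of such an objective can be sketched as follows. This is a generic illustration under assumed notation, not a formula taken from the abstract: the symbols K (number of groups), alpha_k (random effects for group k, with density phi), beta_j (fixed-effect coefficients), gamma_t (the parameters governing the variance and covariance contribution of the t-th random effect), rho_0 and rho_1 (penalty functions), and lambda_0 and lambda_1 (tuning parameters) are all assumptions introduced here.

```latex
% Generic penalized GLMM objective (illustrative notation, assumed for this sketch)
\begin{aligned}
\ell(\theta) &= \sum_{k=1}^{K} \log \int f(y_k \mid X_k, \alpha_k; \theta)\,
                \phi(\alpha_k; \theta)\, d\alpha_k, \\
\hat{\theta} &= \arg\min_{\theta}\; \Big\{ -\ell(\theta)
                + \lambda_0 \sum_{j=1}^{p} \rho_0(\beta_j)
                + \lambda_1 \sum_{t=1}^{q} \rho_1\big(\lVert \gamma_t \rVert_2\big) \Big\}.
\end{aligned}
```

In a formulation of this kind, shrinking a coefficient beta_j to zero drops that fixed effect, and shrinking the whole group gamma_t to zero drops the corresponding random effect. The integral over alpha_k generally has no closed form for non-Gaussian families, which is why a Monte Carlo scheme such as the MCECM algorithm named in the abstract is needed.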