Power and sample size for random coefficient regression models in randomized experiments with monotone missing data.

Affiliations

Department of Biostatistics, Genentech Inc., San Francisco, CA, USA.

Department of Neurosciences, University of California San Diego, San Diego, CA, USA.

Publication information

Biom J. 2021 Apr;63(4):806-824. doi: 10.1002/bimj.202000184. Epub 2021 Feb 15.

DOI:10.1002/bimj.202000184

PMID:33586212

Abstract

Random coefficient regression (also known as random effects, mixed effects, growth curve, variance component, multilevel, or hierarchical linear modeling) can be a natural and useful approach for characterizing and testing hypotheses in data that are correlated within experimental units. Existing power and sample size software for such data is based either on models with two variance components or on a two-stage formulation. These approaches may be markedly inaccurate in settings where more variance components (i.e., intercept, rate of change, and residual error) are warranted. We present variance, power, and sample size formulae, together with software (an R Shiny app), for use with random coefficient regression models with possible missing data and variable follow-up. We illustrate sample size and study design planning using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. We additionally examine the drivers of variability to better inform study design.
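
To make the role of the three variance components concrete, the following is a minimal sketch of the textbook complete-data special case, not the formulae or software presented in the paper. It assumes a random intercept and slope model, a common visit schedule for every subject, no missing data, equal allocation to two arms, and a normal approximation to the test of the difference in mean slopes. Under those assumptions the variance of a subject's estimated slope is the slope variance plus the residual variance divided by the sum of squared deviations of the visit times. The function and parameter names (`slope_variance`, `n_per_arm`, `var_slope`, `var_resid`, `delta`) are hypothetical and chosen for illustration only.

```python
import math
from statistics import NormalDist


def slope_variance(visit_times, var_slope, var_resid):
    """Variance of one subject's estimated slope, assuming complete data
    at a fixed visit schedule (illustrative special case only)."""
    t_bar = sum(visit_times) / len(visit_times)
    # Sum of squared deviations of visit times: longer or denser
    # follow-up shrinks the residual-error contribution.
    s_tt = sum((t - t_bar) ** 2 for t in visit_times)
    return var_slope + var_resid / s_tt


def n_per_arm(delta, visit_times, var_slope, var_resid,
              alpha=0.05, power=0.80):
    """Subjects per arm to detect a difference `delta` in mean slope
    with a two-sided level-`alpha` test (normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    v = slope_variance(visit_times, var_slope, var_resid)
    # The factor 2 reflects two arms of equal size.
    return math.ceil(2 * v * z_sum ** 2 / delta ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: visits every 6 months for 2 years.
    visits = [0.0, 0.5, 1.0, 1.5, 2.0]
    print(n_per_arm(delta=1.0, visit_times=visits,
                    var_slope=1.0, var_resid=2.5))
```

In this simplified setting the intercept variance drops out of the slope comparison, and the required sample size is driven by the slope variance, the residual variance, and the visit schedule. Dropout and variable follow-up change the per-subject information, which is the situation the paper's formulae are designed to handle and this sketch does not.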
