Institute for Medical Information Sciences, Biometry and Epidemiology (IBE), Ludwig-Maximilians University, Marchioninistr. 15, 81377 Munich, Germany.
BMC Med Res Methodol. 2012 Sep 10;12:137. doi: 10.1186/1471-2288-12-137.
A statistical analysis plan (SAP) is a critical link between how a clinical trial is conducted and the clinical study report. To secure objective study results, regulatory bodies expect that the SAP will meet requirements in pre-specifying inferential analyses and other important statistical techniques. Writing a good SAP for model-based sensitivity and ancillary analyses involves non-trivial decisions on, and justification of, many aspects of the chosen setting. In particular, trials with longitudinal count data as primary endpoints pose challenges for model choice and model validation. In the random effects setting, frequentist strategies for model assessment and model diagnosis are complex, not easily implemented, and subject to several limitations. Therefore, it is of interest to explore Bayesian alternatives which provide the needed decision support to finalize a SAP.
We focus on generalized linear mixed models (GLMMs) for the analysis of longitudinal count data. A series of distributions with over- and under-dispersion is considered. Additionally, the structure of the variance components is modified. We perform a simulation study to investigate the discriminatory power of Bayesian tools for model criticism in different scenarios derived from the model setting. We apply the findings to the data from an open clinical trial on vertigo attacks. These data are seen as pilot data for an ongoing phase III trial. To fit GLMMs we use a novel Bayesian computational approach based on integrated nested Laplace approximations (INLA). The INLA methodology enables the direct computation of leave-one-out predictive distributions. These distributions are crucial for Bayesian model assessment. We evaluate competing GLMMs for longitudinal count data according to the deviance information criterion (DIC) or probability integral transform (PIT), and by using proper scoring rules (e.g. the logarithmic score).
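The two predictive checks named above can be illustrated with a minimal sketch. It does not reproduce the INLA workflow: instead of leave-one-out predictive distributions from a fitted GLMM, it assumes that every observation shares one fixed predictive distribution, either Poisson or negative binomial. The counts, the mean `mu`, the dispersion `size`, and all function names are hypothetical and chosen only for illustration. The logarithmic score is the negative log predictive probability of the observed count (smaller is better), and for count data the PIT is represented here by the interval between the predictive CDF at y - 1 and at y.

```python
import math

def poisson_logpmf(y, mu):
    """Log probability of count y under a Poisson distribution with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def negbin_logpmf(y, mu, size):
    """Log probability under a negative binomial with mean mu and
    dispersion `size` (variance = mu + mu**2 / size)."""
    return (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
            + size * math.log(size / (size + mu))
            + y * math.log(mu / (size + mu)))

def mean_log_score(counts, logpmf):
    """Mean logarithmic score: average of -log p(y); smaller is better."""
    return -sum(logpmf(y) for y in counts) / len(counts)

def pit_bounds(y, logpmf):
    """PIT interval for a count: (P(Y <= y - 1), P(Y <= y))."""
    lower = sum(math.exp(logpmf(k)) for k in range(y))
    upper = lower + math.exp(logpmf(y))
    return lower, upper

# Hypothetical overdispersed counts (sample mean 3), not data from the trial.
counts = [0, 1, 0, 7, 2, 0, 12, 3, 1, 4]

# Assumed predictive distributions, shared by all observations for simplicity.
score_poisson = mean_log_score(counts, lambda y: poisson_logpmf(y, 3.0))
score_negbin = mean_log_score(counts, lambda y: negbin_logpmf(y, 3.0, 1.0))

# PIT interval of a zero count under the assumed Poisson predictive.
pit_lower, pit_upper = pit_bounds(0, lambda y: poisson_logpmf(y, 3.0))

print(f"mean log score, Poisson: {score_poisson:.3f}")
print(f"mean log score, negative binomial: {score_negbin:.3f}")
```

In this toy setting the negative binomial predictive attains the smaller mean score, which is the kind of ranking the logarithmic score is meant to provide when the data are overdispersed.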
The instruments under study provide excellent tools for preparing decisions within the SAP in a transparent way when structuring the primary analysis, sensitivity or ancillary analyses, and specific analyses for secondary endpoints. The mean logarithmic score and DIC discriminate well between different model scenarios. It becomes obvious that the naive choice of a conventional random effects Poisson model is often inappropriate for real-life count data. The findings are used to specify an appropriate mixed model employed in the sensitivity analyses of an ongoing phase III trial.
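For readers unfamiliar with the DIC, the following minimal sketch shows how it is commonly assembled from posterior draws: the posterior mean deviance plus the effective number of parameters, where the latter is the mean deviance minus the deviance at the posterior mean. This is a Monte Carlo illustration, not the INLA-based computation used in the study. The model is assumed to be a single Poisson rate with a conjugate gamma prior, and the counts, the prior parameters, and the function names are hypothetical.

```python
import math
import random

def poisson_deviance(counts, mu):
    """Deviance -2 * log-likelihood of i.i.d. Poisson counts with mean mu."""
    loglik = sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)
    return -2.0 * loglik

def dic_from_draws(counts, draws):
    """Return (DIC, p_D) computed from posterior draws of the Poisson mean."""
    d_bar = sum(poisson_deviance(counts, mu) for mu in draws) / len(draws)
    mu_bar = sum(draws) / len(draws)
    d_hat = poisson_deviance(counts, mu_bar)  # deviance at the posterior mean
    p_d = d_bar - d_hat                       # effective number of parameters
    return d_bar + p_d, p_d

random.seed(1)

# Hypothetical counts, not data from the trial.
counts = [0, 1, 0, 7, 2, 0, 12, 3, 1, 4]

# Assumed Gamma(0.01, 0.01) prior on the rate, giving a conjugate
# Gamma(shape, rate) posterior; gammavariate expects a scale, hence 1 / rate.
shape = 0.01 + sum(counts)
rate = 0.01 + len(counts)
draws = [random.gammavariate(shape, 1.0 / rate) for _ in range(5000)]

dic, p_d = dic_from_draws(counts, draws)
print(f"DIC = {dic:.2f}, effective number of parameters = {p_d:.2f}")
```

With a single free parameter the effective number of parameters should come out close to one. Competing models would be compared by their DIC values, with smaller values preferred.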
The proposed Bayesian methods are not only appealing for inference but also provide sophisticated insight into different aspects of model performance, such as forecast verification or calibration checks, and can be applied within the model selection process. The mean of the logarithmic score is a robust tool for model ranking and is not sensitive to sample size. Therefore, these Bayesian model selection techniques offer helpful decision support for shaping sensitivity and ancillary analyses in a statistical analysis plan of a clinical trial with longitudinal count data as the primary endpoint.