Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Institute of Health Economics and Health Care Management, Neuherberg, 85764, Germany.
BMC Med Res Methodol. 2012 Sep 17;12:144. doi: 10.1186/1471-2288-12-144.
Health-related quality of life (HRQL) has become an increasingly important outcome parameter in clinical trials and epidemiological research. HRQL scores are typically bounded at both ends of the scale and are often highly skewed. Several regression techniques have been proposed to model such data in cross-sectional studies; however, methods applicable to longitudinal research have received less attention. This study examined the use of beta regression models for analyzing longitudinal HRQL data using two empirical examples with distributional features typically encountered in practice.
We used SF-6D utility data from a German older age cohort study and stroke-specific HRQL data from a randomized controlled trial. We described the conceptual differences between mixed and marginal beta regression models and compared both models to the commonly used linear mixed model in terms of overall fit and predictive accuracy.
At every measurement time, the beta distribution fitted the SF-6D utility data and the stroke-specific HRQL data better than the normal distribution. The mixed beta model showed better likelihood-based fit statistics than the linear mixed model and respected the boundedness of the outcome variable. However, it tended to underestimate the true mean in the upper part of the distribution. Adjusted group means from the marginal beta model and the linear mixed model were nearly identical, but their standard errors differed.
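As a rough illustration of a likelihood-based comparison between a beta and a normal distribution, the sketch below fits both to simulated scores on the unit interval. The data are simulated and not the study data; the shape parameters (6.0, 1.5), the method-of-moments fit, and all function names are assumptions chosen for illustration, and the study itself may have used a different estimation procedure.

```python
import math
import random


def clamp(v, eps=1e-9):
    # The beta density is defined on the open interval (0, 1);
    # scores sitting exactly on a boundary are nudged inward (an assumption).
    return min(max(v, eps), 1.0 - eps)


def moments(y):
    # Sample mean and (population) variance.
    n = len(y)
    mu = sum(y) / n
    var = sum((v - mu) ** 2 for v in y) / n
    return mu, var


def beta_from_moments(mu, var):
    # Method-of-moments estimates of the beta shape parameters.
    phi = mu * (1.0 - mu) / var - 1.0  # precision, roughly a + b
    return mu * phi, (1.0 - mu) * phi


def beta_loglik(y, a, b):
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return sum(
        log_norm
        + (a - 1.0) * math.log(clamp(v))
        + (b - 1.0) * math.log(1.0 - clamp(v))
        for v in y
    )


def normal_loglik(y, mu, var):
    return sum(
        -0.5 * math.log(2.0 * math.pi * var) - (v - mu) ** 2 / (2.0 * var)
        for v in y
    )


random.seed(1)
# Hypothetical left-skewed scores bounded between 0 and 1.
scores = [random.betavariate(6.0, 1.5) for _ in range(1000)]

mu, var = moments(scores)
a, b = beta_from_moments(mu, var)
ll_beta = beta_loglik(scores, a, b)
ll_normal = normal_loglik(scores, mu, var)

# Both distributions have two parameters, so comparing log-likelihoods
# gives the same ranking as comparing AIC values.
print(f"beta log-likelihood:   {ll_beta:.1f}")
print(f"normal log-likelihood: {ll_normal:.1f}")
```

For skewed, bounded data of this kind, the beta log-likelihood would be expected to exceed the normal one, which is the pattern the comparison above describes.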
Understanding the conceptual differences between mixed and marginal beta regression models is important for their proper use in the analysis of longitudinal HRQL data. Beta regression fits the typical distribution of HRQL data better than linear mixed models; however, if the focus is on estimating group mean scores rather than making individual predictions, the two methods might not differ substantially.
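One way to see the conceptual difference between a mixed (subject-specific) and a marginal (population-averaged) model is that, under a nonlinear link, the mean for a subject with a random effect of zero is not the mean over all subjects. The sketch below assumes a logit link and a normally distributed random intercept; the linear predictor value, the random-effect standard deviation, and all names are hypothetical and not taken from the study.

```python
import math
import random


def expit(x):
    # Inverse logit: maps a linear predictor to a mean in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


random.seed(0)

eta = 1.5        # hypothetical fixed-effect linear predictor (logit scale)
sigma_u = 1.0    # hypothetical standard deviation of the random intercept

# Simulated subject-specific random intercepts.
random_intercepts = [random.gauss(0.0, sigma_u) for _ in range(100_000)]

# Conditional mean: the expected score for a subject whose random intercept is 0.
conditional_mean = expit(eta)

# Marginal mean: the expected score averaged over the whole population.
marginal_mean = sum(expit(eta + u) for u in random_intercepts) / len(random_intercepts)

print(f"conditional mean (u = 0): {conditional_mean:.3f}")
print(f"marginal mean:            {marginal_mean:.3f}")
```

In this setup, with a positive linear predictor, the population-averaged mean falls below the conditional mean evaluated at a zero random effect, so coefficients from the two model types answer different questions and should not be compared directly.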