A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables.

Author information

Smithson Michael, Verkuilen Jay

Affiliation

The Australian National University, Canberra, Australian Capital Territory, Australia.

Publication information

Psychol Methods. 2006 Mar;11(1):54-71. doi: 10.1037/1082-989X.11.1.54.

DOI:10.1037/1082-989X.11.1.54

PMID:16594767

Abstract

Uncorrectable skew and heteroscedasticity are among the "lemons" of psychological data, yet many important variables naturally exhibit these properties. For scales with a lower and upper bound, a suitable candidate for models is the beta distribution, which is very flexible and models skew quite well. The authors present maximum-likelihood regression models assuming that the dependent variable is conditionally beta distributed rather than Gaussian. The approach models both means (location) and variances (dispersion) with their own distinct sets of predictors (continuous and/or categorical), thereby modeling heteroscedasticity. The location sub-model link function is the logit and thereby analogous to logistic regression, whereas the dispersion sub-model is log linear. Real examples show that these models handle the independent observations case readily. The article discusses comparisons between beta regression and alternative techniques, model selection and interpretation, practical estimation, and software.
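
As an illustration of the model structure described in the abstract, the following is a minimal Python sketch, not the authors' software. It assumes the common mean/precision parameterization of the beta distribution: a logit link gives the mean mu, a log link gives a precision phi, and the shape parameters are mu*phi and (1 - mu)*phi. Modeling the log of the precision (rather than of a dispersion with the opposite sign) is an assumption made here, and the names `neg_log_likelihood`, `fit_beta_regression`, and the simulated data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln


def neg_log_likelihood(params, y, X, W):
    """Negative log-likelihood of a beta regression.

    X: design matrix of the location (mean) submodel, logit link.
    W: design matrix of the precision submodel, log link.
    """
    k = X.shape[1]
    beta, gamma = params[:k], params[k:]
    mu = expit(X @ beta)       # conditional mean, inside (0, 1)
    phi = np.exp(W @ gamma)    # precision > 0 (assumed sign convention)
    a, b = mu * phi, (1.0 - mu) * phi
    ll = (gammaln(phi) - gammaln(a) - gammaln(b)
          + (a - 1.0) * np.log(y) + (b - 1.0) * np.log1p(-y))
    return -np.sum(ll)


def fit_beta_regression(y, X, W):
    """Maximize the likelihood numerically, starting from all zeros."""
    k = X.shape[1]
    start = np.zeros(k + W.shape[1])
    result = minimize(neg_log_likelihood, start, args=(y, X, W), method="BFGS")
    return result.x[:k], result.x[k:], result


# Simulated data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])   # predictors of the mean
W = X.copy()                           # predictors of the precision
true_beta = np.array([0.5, -1.0])
true_gamma = np.array([3.0, 0.3])
mu = expit(X @ true_beta)
phi = np.exp(W @ true_gamma)
y = rng.beta(mu * phi, (1.0 - mu) * phi)
y = np.clip(y, 1e-6, 1.0 - 1e-6)       # safeguard: keep y strictly inside (0, 1)

beta_hat, gamma_hat, result = fit_beta_regression(y, X, W)
print("location coefficients: ", beta_hat)
print("precision coefficients:", gamma_hat)
```

Because the two submodels have separate design matrices, a predictor can enter the mean, the precision, or both, which is how heteroscedasticity is modeled. The sketch requires responses strictly between 0 and 1; scores that fall exactly on a scale bound would need rescaling first.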
