Galvis Diana M, Bandyopadhyay Dipankar, Lachos Victor H
Departamento de Estatística, IMECC-UNICAMP, Campinas, São Paulo, Brazil.
Stat Med. 2014 Sep 20;33(21):3759-71. doi: 10.1002/sim.6179. Epub 2014 Apr 24.
Continuous (clustered) proportion data often arise in various domains of medicine and public health where the response variable of interest is a proportion (or percentage) quantifying disease status for the cluster units, ranging between zero and one. However, because any study may include relatively disease-free as well as heavily diseased subjects, the proportion values can lie in the closed interval [0,1]. While beta regression can be adapted to assess covariate effects in these situations, its versatility is often challenged by the presence (or excess) of zeros and ones, because the beta support is the open interval (0,1). To circumvent this, we augment the beta density with point-mass probabilities at zero and one, controlling for the clustering effect. Our approach is Bayesian, with the ability to borrow information across various stages of the complex model hierarchy, and produces a computationally convenient framework amenable to available freeware. The marginal likelihood is tractable and can be used to develop Bayesian case-deletion influence diagnostics based on q-divergence measures. Both simulation studies and an application to a real dataset from a clinical periodontology study quantify the gain in model fit and parameter estimation over other ad hoc alternatives and provide quantitative insight into assessing the true covariate effects on the proportion responses.
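The augmentation described in the abstract can be illustrated with a minimal sketch of a zero-one augmented beta log-density. This is not the authors' implementation: the function names, the symbols p0 and p1 for the point masses at zero and one, and the mean-precision parameterization Beta(mu*phi, (1-mu)*phi) are assumptions made for illustration, and the cluster-level random effects and the Bayesian hierarchy are omitted entirely.

```python
import math


def beta_logpdf(y, mu, phi):
    """Log density of Beta(mu*phi, (1-mu)*phi) at y in the open interval (0, 1).

    The mean-precision parameterization (mu, phi) is an assumption of this sketch.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def zoab_logpdf(y, p0, p1, mu, phi):
    """Log density/mass of a zero-one augmented beta variable at y in [0, 1].

    p0 and p1 (hypothetical names) are the point masses at 0 and 1; the
    remaining mass 1 - p0 - p1 is spread over (0, 1) by the beta density.
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must lie in [0, 1]")
    if p0 < 0.0 or p1 < 0.0 or p0 + p1 >= 1.0:
        raise ValueError("need p0 >= 0, p1 >= 0 and p0 + p1 < 1")
    if y == 0.0:
        return math.log(p0) if p0 > 0.0 else -math.inf
    if y == 1.0:
        return math.log(p1) if p1 > 0.0 else -math.inf
    return math.log(1.0 - p0 - p1) + beta_logpdf(y, mu, phi)


def zoab_mean(p0, p1, mu):
    """Mean of the mixture: mass at one plus the beta mean weighted by 1 - p0 - p1."""
    return p1 + (1.0 - p0 - p1) * mu
```

In a regression setting, p0, p1 and mu would each be linked to covariates (and mu to cluster effects), which this sketch leaves out; it only shows how exact zeros and ones receive positive probability while interior values keep a beta density.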