Lewis Bradley R, Bandyopadhyay Dipankar, DeSantis Stacia M, John Mike T
Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA.
Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, USA.
J Appl Probab Stat. 2017 May;12(1):49-66.
In clinical dental research, clinical attachment level (CAL) is often recorded at several sites throughout the mouth to assess the extent of periodontal disease (PD). One may wish to quantify PD at the tooth level via the proportion of diseased sites per tooth type (incisors, canines, premolars, and molars) per subject. However, such studies may include both relatively disease-free and highly diseased subjects, so the proportion responses lie in the closed interval [0, 1], with some values exactly zero or exactly one. While beta regression (BR) is often the model of choice for assessing covariate effects on proportion data, the presence (and/or abundance) of zeros and/or ones makes it inapplicable here because the beta density is supported only on the open interval (0, 1). Avoiding ad hoc data transformation, we explore the potential of the augmented BR framework, which augments the beta density with non-zero point masses at zero and one while accounting for the clustering in the data. Our classical estimation framework uses maximum likelihood, implemented in SAS® PROC NLMIXED. We evaluate the methodology through simulation studies and an application to a real cross-sectional dataset on PD, and we assess the gain in model fit and parameter estimation over ad hoc alternatives. This reveals new insights into risk quantification for clustered proportion responses. Our methods can be implemented using standard SAS software routines. The augmented BR model fits clustered periodontal proportion data better than the standard beta model. We recommend it as a parametric alternative for fitting proportion data that avoids ad hoc data transformation.
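To make the idea of augmenting the beta density with point masses at zero and one concrete, the following is a minimal Python sketch of the log-density of a zero-and-one augmented beta distribution. It is an illustration under stated assumptions, not the authors' SAS implementation: the function name `augmented_beta_logpdf`, the symbols `p0` and `p1` for the point masses, and the mean-precision parameterization (`mu`, `phi`) of the beta component are all hypothetical choices, and the subject-level clustering described in the abstract is not modeled here.

```python
import math


def augmented_beta_logpdf(y, p0, p1, mu, phi):
    """Log-density of a zero-and-one augmented beta distribution.

    Assumed (hypothetical) parameterization:
      p0, p1 : point masses at y == 0 and y == 1, with p0 > 0, p1 > 0, p0 + p1 < 1
      mu     : mean of the beta component, 0 < mu < 1
      phi    : precision of the beta component, phi > 0
    The beta component has shape parameters a = mu * phi, b = (1 - mu) * phi.
    """
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must lie in [0, 1]")
    if p0 <= 0.0 or p1 <= 0.0 or p0 + p1 >= 1.0:
        raise ValueError("need p0 > 0, p1 > 0 and p0 + p1 < 1")
    if not 0.0 < mu < 1.0 or phi <= 0.0:
        raise ValueError("need 0 < mu < 1 and phi > 0")

    # Boundary observations contribute only their point mass.
    if y == 0.0:
        return math.log(p0)
    if y == 1.0:
        return math.log(p1)

    # Interior observations: remaining mass times the beta density.
    a = mu * phi
    b = (1.0 - mu) * phi
    log_beta_density = (
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + (a - 1.0) * math.log(y)
        + (b - 1.0) * math.log(1.0 - y)
    )
    return math.log(1.0 - p0 - p1) + log_beta_density


# Example: log-likelihood of a few proportions, including exact 0 and 1.
sample = [0.0, 0.25, 0.6, 1.0]
loglik = sum(augmented_beta_logpdf(y, p0=0.2, p1=0.1, mu=0.5, phi=2.0) for y in sample)
```

In a regression setting, `p0`, `p1`, and `mu` would each be linked to covariates (for example through logit-type links), and a subject-level random effect would be added to handle clustering; those pieces are omitted from this sketch.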