Liu Qi, Shepherd Bryan E, Li Chun, Harrell Frank E
Department of Biostatistics, Vanderbilt University, Nashville, TN 37203, USA.
Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA.
Stat Med. 2017 Nov 30;36(27):4316-4335. doi: 10.1002/sim.7433. Epub 2017 Sep 5.
We study the application of a widely used ordinal regression model, the cumulative probability model (CPM), for continuous outcomes. Such models are attractive for the analysis of continuous response variables because they are invariant to any monotonic transformation of the outcome and because they directly model the cumulative distribution function from which summaries such as expectations and quantiles can easily be derived. Such models can also readily handle mixed type distributions. We describe the motivation, estimation, inference, model assumptions, and diagnostics. We demonstrate that CPMs applied to continuous outcomes are semiparametric transformation models. Extensive simulations are performed to investigate the finite sample performance of these models. We find that properly specified CPMs generally have good finite sample performance with moderate sample sizes, but that bias may occur when the sample size is small. Cumulative probability models are fairly robust to minor or moderate link function misspecification in our simulations. For certain purposes, the CPMs are more efficient than other models. We illustrate their application, with model diagnostics, in a study of the treatment of HIV. CD4 cell count and viral load 6 months after the initiation of antiretroviral therapy are modeled using CPMs; both variables typically require transformations, and viral load has a large proportion of measurements below a detection limit.
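As a minimal sketch of how summaries can be derived from a modeled cumulative distribution function, assume a CPM with a logit link written in one common parametrization, F(y | x) = expit(alpha(y) - beta'x), with one intercept per distinct observed outcome value except the largest. All numbers below (`y_levels`, `alpha`, `beta`) are hypothetical values chosen for illustration, not estimates from the study, and the helper names are not from any package.

```python
import numpy as np

def expit(z):
    """Inverse logit link."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical "fitted" quantities, for illustration only.
y_levels = np.array([50.0, 120.0, 200.0, 350.0, 600.0])  # sorted distinct outcomes
alpha = np.array([-2.0, -0.5, 0.5, 2.0])  # increasing intercepts, one per level but the last
beta = np.array([0.8])                    # covariate effect on the link scale

def cdf_at_levels(x):
    """Estimated P(Y <= y_j | x) at each distinct outcome value y_j."""
    lp = float(np.dot(beta, x))
    cdf = expit(alpha - lp)
    return np.append(cdf, 1.0)  # the CDF reaches 1 at the largest level

def conditional_mean(x):
    """Mean of the step-function distribution: sum of y_j times its jump."""
    pmf = np.diff(cdf_at_levels(x), prepend=0.0)
    return float(np.sum(y_levels * pmf))

def conditional_quantile(x, q):
    """Smallest y_j whose CDF value is at least q (for 0 < q <= 1)."""
    idx = int(np.searchsorted(cdf_at_levels(x), q, side="left"))
    return float(y_levels[idx])

print(conditional_quantile([0.0], 0.5), conditional_mean([0.0]))
print(conditional_quantile([1.0], 0.5), conditional_mean([1.0]))
```

Under this sign convention a larger linear predictor shifts the distribution toward larger outcomes. The quantile depends on the outcome only through the ordering of `y_levels`, so applying a monotonic increasing transformation to the outcome transforms the quantile in the same way, whereas the mean is computed on whatever scale `y_levels` is expressed in.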