Shepherd Bryan E, Li Chun, Liu Qi
Vanderbilt University School of Medicine.
Case Western Reserve University.
Can J Stat. 2016 Dec;44(4):463-479. doi: 10.1002/cjs.11302. Epub 2016 Aug 24.
We describe a new residual for general regression models, defined as pr(Y* < y) - pr(Y* > y), where y is the observed outcome and Y* is a random variable from the fitted distribution. This probability-scale residual can be written as E{sign(y, Y*)}, whereas the popular observed-minus-expected residual can be thought of as E(y - Y*). Therefore, the probability-scale residual is useful in settings where differences are not meaningful or where the expectation of the fitted distribution cannot be calculated. We present several desirable properties of the probability-scale residual that make it useful for diagnostics and for measuring residual correlation, especially across different outcome types. We demonstrate its utility for continuous, ordered discrete, and censored outcomes, including current status data, and with various models, including Cox regression, quantile regression, and ordinal cumulative probability models, for which fully specified distributions are not desirable or needed and, in some cases, suitable residuals are not available. The residual is illustrated with simulated data and real datasets from HIV-infected patients on therapy in the southeastern United States and Latin America.
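The definition of the residual can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' implementation: the function names (`normal_cdf`, `psr_continuous`, `psr_discrete`) are hypothetical, and the normal fitted distribution and the three-level probability table are assumptions chosen for the example. For a continuous fitted distribution with CDF F, the residual reduces to 2F(y) - 1; for an ordered discrete outcome, it is the fitted probability below the observed level minus the fitted probability above it.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def psr_continuous(y, cdf):
    """Probability-scale residual for a continuous fitted distribution.

    pr(Y* < y) - pr(Y* > y) = F(y) - (1 - F(y)) = 2 * F(y) - 1,
    where `cdf` is the fitted CDF F (an assumption supplied by the caller).
    """
    p = cdf(y)
    return p - (1.0 - p)


def psr_discrete(y, pmf):
    """Probability-scale residual for an ordered discrete fitted distribution.

    `pmf` maps each ordered level to its fitted probability; levels must be
    comparable with `<` and `>`.
    """
    below = sum(p for level, p in pmf.items() if level < y)
    above = sum(p for level, p in pmf.items() if level > y)
    return below - above


# Continuous example: outcome observed at the fitted median gives residual 0.
r_median = psr_continuous(0.0, normal_cdf)

# Continuous example: outcome one standard deviation above the fitted mean.
r_high = psr_continuous(1.0, normal_cdf)

# Ordered discrete example with hypothetical fitted probabilities.
fitted = {1: 0.2, 2: 0.5, 3: 0.3}
r_ordinal = psr_discrete(2, fitted)
```

Because the residual depends only on the ordering of y relative to Y*, it always lies in [-1, 1] and needs no numeric differences between outcome levels, which is why it applies to ordinal outcomes where y - Y* has no meaning.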