Cadarso-Suárez Carmen, Roca-Pardiñas Javier, Figueiras Adolfo, González-Manteiga Wenceslao
Unit of Biostatistics, Department of Statistics and Operations Research, University of Santiago de Compostela, Spain.
Stat Med. 2005 Apr 30;24(8):1169-84. doi: 10.1002/sim.1978.
The generalized additive model (GAM) is a powerful and widely used tool that allows researchers to fit, non-parametrically, the effect of continuous predictors on a transformation of the mean response variable. Such a transformation is given by a so-called link function, and in GAMs this link function is assumed to be known. Nevertheless, if an incorrect choice is made for the link, the resulting GAM is misspecified and the results obtained may be misleading. In this paper, we propose a modified version of the local scoring algorithm that allows for the non-parametric estimation of the link function, by using local linear kernel smoothers. To better understand the effect that each covariate produces on the outcome, results are expressed in terms of the non-parametric odds ratio (OR) curves. Bootstrap techniques were used to correct the bias in the OR estimation and to construct point-wise confidence intervals. A simulation study was carried out to assess the behaviour of the resulting estimates. The proposed methodology was illustrated using data from the AIDS Register of Galicia (NW Spain), with a view to assessing the effect of the CD4 lymphocyte count on the probability of being AIDS-diagnosed via Tuberculosis (TB). This application shows how the link's flexibility makes it possible to obtain OR curve estimates that are less sensitive to the presence of outliers and unusual values that are often present in the extremes of the covariate distributions.
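As a rough illustration of two building blocks named in the abstract, the sketch below shows a generic local linear kernel smoother and an odds ratio curve computed from fitted probabilities. It is not the authors' modified local scoring algorithm: the Gaussian kernel, the fixed bandwidth, and the function names `local_linear_smooth` and `odds_ratio_curve` are assumptions made for illustration, and the bootstrap bias correction and confidence intervals are omitted.

```python
import numpy as np


def local_linear_smooth(x, y, x0, bandwidth):
    """Local linear kernel estimate of E[y | x = x0].

    Assumption: Gaussian kernel with a fixed, user-supplied bandwidth.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Kernel weights: observations near x0 count more.
    u = (x - x0) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    # Weighted least squares of y on (1, x - x0);
    # the fitted intercept is the smoothed value at x0.
    design = np.column_stack([np.ones_like(x), x - x0])
    weighted = design.T * w
    beta = np.linalg.solve(weighted @ design, weighted @ y)
    return beta[0]


def odds_ratio_curve(p, p_ref):
    """Odds ratio of each fitted probability in p relative to p_ref."""
    p = np.asarray(p, dtype=float)
    odds = p / (1.0 - p)
    odds_ref = p_ref / (1.0 - p_ref)
    return odds / odds_ref
```

Because the odds ratio is computed directly from fitted probabilities, this definition does not depend on which link function produced them, which is the property that makes OR curves a natural summary when the link itself is estimated.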