Residuals and regression diagnostics: focusing on logistic regression

Author

Zhang Zhongheng

Affiliation

Department of Critical Care Medicine, Jinhua Municipal Central Hospital, Jinhua Hospital of Zhejiang University, Jinhua 321000, China.

Publication

Ann Transl Med. 2016 May;4(10):195. doi: 10.21037/atm.2016.03.36.

Abstract

Up to now I have introduced most of the steps in regression model building and validation. The last step is to check whether any observations have a substantial impact on the model coefficients and specification. This article first describes plotting Pearson residuals against predictors. Such plots help identify non-linearity and provide hints on how to transform predictors. Next, I focus on outlying, high-leverage and influential observations, which may have a substantial impact on model building. An outlier is an observation whose response value is unusual conditional on its covariate pattern. A high-leverage observation is one whose covariate pattern lies far from the center of the regressor space. Influence is the product of outlyingness and leverage: when an influential observation is dropped from the model, the coefficients shift markedly. The summary statistics for outliers, leverage and influence are studentized residuals, hat values and Cook's distance, respectively. They can be easily visualized with graphs and formally tested using the car package.
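
The three summary statistics can be illustrated with a small sketch. The Python code below is not taken from the article, which refers to the car package. It is a minimal illustration under stated assumptions: a logistic model with a single predictor, made-up toy data, and hypothetical helper names (fit_logistic, diagnostics). It reports standardized Pearson residuals rather than studentized residuals, and it uses the common one-step approximation to Cook's distance for generalized linear models.

```python
import math


def fit_logistic(x, y, n_iter=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson; return (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Score vector X'(y - p)
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Information matrix X'WX = [[a, b], [b, c]]
        a = sum(w)
        b = sum(wi * xi for wi, xi in zip(w, x))
        c = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = a * c - b * b
        # Newton step: beta += (X'WX)^{-1} X'(y - p)
        b0 += (c * g0 - b * g1) / det
        b1 += (a * g1 - b * g0) / det
    return b0, b1


def diagnostics(x, y, b0, b1):
    """Per-observation Pearson residual, hat value and Cook's distance."""
    p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
    w = [pi * (1.0 - pi) for pi in p]
    a = sum(w)
    b = sum(wi * xi for wi, xi in zip(w, x))
    c = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = a * c - b * b
    n_params = 2  # intercept and slope
    rows = []
    for xi, yi, pi, wi in zip(x, y, p, w):
        pearson = (yi - pi) / math.sqrt(wi)
        # Hat value: w_i * (1, x_i) (X'WX)^{-1} (1, x_i)'
        hat = wi * (c - 2.0 * b * xi + a * xi * xi) / det
        std_pearson = pearson / math.sqrt(1.0 - hat)
        # One-step approximation to Cook's distance (dispersion = 1)
        cook = std_pearson ** 2 * hat / (n_params * (1.0 - hat))
        rows.append({"x": xi, "y": yi, "p": pi, "pearson": pearson,
                     "hat": hat, "std_pearson": std_pearson, "cook": cook})
    return rows


# Toy data (made up): the last point has an extreme covariate value
# and a response that goes against the overall trend.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 9.0]
y = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0]

b0, b1 = fit_logistic(x, y)
rows = diagnostics(x, y, b0, b1)
for r in rows:
    print(f"x={r['x']:4.1f} y={r['y']} pearson={r['pearson']:6.2f} "
          f"hat={r['hat']:5.2f} cook={r['cook']:5.2f}")
```

In this toy data set the last observation (x = 9.0, y = 0) lies far from the other covariate values and has an unexpected response, so it has both the largest hat value and the largest Cook's distance. Plotting the pearson column against x gives the residual-versus-predictor plot described above.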
