Eledum Hussein
Department of Statistics, University of Tabuk, Saudi Arabia.
Heliyon. 2021 Aug 17;7(8):e07792. doi: 10.1016/j.heliyon.2021.e07792. eCollection 2021 Aug.
Identifying influential observations is an essential part of building a linear regression model. Various influence measures, including Cook's distance and DFFITS, are designed to detect influential observations in linear regression fitted by least squares (LS). Detecting influential observations becomes more complicated when the data also exhibit severe collinearity, which reduces the efficiency of these detection measures. This paper proposes new diagnostic methods based on the Liu-type estimator (LTE) defined by Liu [1]. Cook's distance and DFFITS are introduced for the LTE, and approximate formulas for both measures are also proposed. Two real data sets with a high level of multicollinearity among the explanatory variables, as well as a simulation study, are used to illustrate and evaluate the performance of the methodologies presented in this paper.
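For orientation, the classical LS versions of the two measures named above can be computed from the residuals and the leverages of the hat matrix. The following is a minimal sketch of those standard LS formulas only; it does not reproduce the LTE-based measures proposed in the paper, whose formulas are not given in this abstract. The function name `ls_influence`, the synthetic near-collinear data, and the contaminated first observation are illustrative assumptions.

```python
import numpy as np


def ls_influence(X, y):
    """Classical LS Cook's distance and DFFITS (X must include the intercept column)."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Leverages: diagonal of the hat matrix X (X'X)^{-1} X'
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # Residual variance estimate from the full fit
    s2 = resid @ resid / (n - p)
    # Cook's distance via the closed-form deletion identity
    cooks_d = resid**2 / (p * s2) * h / (1.0 - h) ** 2
    # Residual variance with observation i deleted
    s2_del = ((n - p) * s2 - resid**2 / (1.0 - h)) / (n - p - 1)
    # DFFITS: externally studentized residual times sqrt(h / (1 - h))
    dffits = resid / np.sqrt(s2_del * (1.0 - h)) * np.sqrt(h / (1.0 - h))
    return cooks_d, dffits


# Hypothetical data: two nearly collinear predictors and one shifted response.
rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)  # nearly collinear with x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=n)
y[0] += 5.0  # contaminate the first observation

cooks_d, dffits = ls_influence(X, y)
print("largest Cook's distance at index", int(np.argmax(cooks_d)))
print("largest |DFFITS| at index", int(np.argmax(np.abs(dffits))))
```

Both measures here rely on the inverse of X'X, which is the quantity that becomes unstable under severe collinearity.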