Rasmussen J L
Multivariate Behav Res. 1988 Apr 1;23(2):189-202. doi: 10.1207/s15327906mbr2302_4.
Comrey (1985) presented a statistic, Dk, to detect outliers. Its purported advantage over the better-known Mahalanobis D squared is that it might be more sensitive to outliers that distort the correlation coefficient. The present study used a Monte Carlo simulation to compare Dk and D squared in terms of their hit and false alarm rates, their extent of overlap, and the effect of outlier removal on the resulting correlation coefficients. The results indicated that D squared had a higher hit rate than Dk with approximately the same false alarm rate. The two statistics identified the same cases as outliers 19 to 55 percent of the time. Surprisingly, the average correlations that resulted from outlier removal by D squared were closer to the population correlations than were those resulting from outlier removal by Dk. Under the conditions investigated, D squared was preferable to Dk as an outlier removal statistic.
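As a rough illustration of the kind of procedure the abstract describes, the following Python sketch computes Mahalanobis D squared for bivariate data, flags the cases with the largest values, and compares the correlation before and after their removal. It is a minimal sketch under stated assumptions, not the study's actual design: the function names, the sample size, the contamination scheme (shifting a few cases against the direction of the correlation), and the rule of flagging a fixed number of cases are all hypothetical choices made for illustration. Dk is not included because the abstract does not give its definition.

```python
import random
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mahalanobis_d2(xs, ys):
    """Mahalanobis D squared of each case from the sample centroid.

    For two variables the quadratic form reduces to
    (zx^2 - 2*r*zx*zy + zy^2) / (1 - r^2), with z-scores based on the
    sample standard deviations and r the sample correlation.
    """
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    r = pearson_r(xs, ys)
    d2 = []
    for x, y in zip(xs, ys):
        zx, zy = (x - mx) / sx, (y - my) / sy
        d2.append((zx * zx - 2 * r * zx * zy + zy * zy) / (1 - r * r))
    return d2


def simulate(n=200, rho=0.5, n_outliers=5, shift=5.0, seed=0):
    """Draw n bivariate normal cases with correlation rho, then contaminate.

    Assumption for illustration only: the first n_outliers cases are
    shifted by (+shift, -shift), which pulls the sample correlation down.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for i in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x, y = z1, rho * z1 + (1 - rho * rho) ** 0.5 * z2
        if i < n_outliers:
            x, y = x + shift, y - shift
        xs.append(x)
        ys.append(y)
    return xs, ys


def flag_largest(d2, k):
    """Indices of the k cases with the largest D squared."""
    order = sorted(range(len(d2)), key=lambda i: d2[i], reverse=True)
    return set(order[:k])


if __name__ == "__main__":
    n_outliers = 5
    xs, ys = simulate(n_outliers=n_outliers)
    flagged = flag_largest(mahalanobis_d2(xs, ys), n_outliers)
    planted = set(range(n_outliers))

    # Hit rate: share of planted outliers that were flagged.
    hits = len(flagged & planted) / n_outliers
    kept = [i for i in range(len(xs)) if i not in flagged]
    r_before = pearson_r(xs, ys)
    r_after = pearson_r([xs[i] for i in kept], [ys[i] for i in kept])
    print(f"hit rate: {hits:.2f}")
    print(f"r before removal: {r_before:.3f}, after removal: {r_after:.3f}")
```

Repeating the run over many seeds and averaging the hit rate and the post-removal correlation would give a small Monte Carlo comparison in the spirit of the study; a second outlier statistic could be compared by substituting its scores for the D squared values passed to `flag_largest`.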