Methodology and Statistics Department, Institute of Psychology, Leiden University.
J Exp Psychol Gen. 2023 Jun;152(6):1735-1753. doi: 10.1037/xge0001357. Epub 2023 Apr 27.
Researchers often remove outliers when comparing groups. It is well documented that the common practice of removing outliers within groups leads to inflated Type I error rates. However, André (2022) recently argued that if outliers are instead removed across groups, Type I error rates are not inflated. The same study presents removing outliers across groups as a special case of the more general concept of hypothesis-blind outlier removal, which it consequently recommends. In this paper, I demonstrate that, contrary to this advice, hypothesis-blind outlier removal is problematic. Specifically, it almost always invalidates confidence intervals and biases estimates if there are group differences. Moreover, it inflates Type I error rates in certain situations, for example, when variances are unequal and the data are nonnormal. Consequently, a data point should not be removed solely because it is deemed an outlier, whether the procedure used is hypothesis-blind or hypothesis-aware. I conclude by recommending valid alternatives. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
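The claim that removing outliers across groups biases estimates when the groups truly differ can be illustrated with a small simulation. The sketch below is not the paper's simulation design. It assumes two normal groups of 50 observations with unit variance, a true mean difference of 1, and a cutoff of 2 pooled standard deviations around the pooled mean. All function names and parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def remove_outliers_across_groups(a, b, z=2.0):
    """Hypothesis-blind rule (illustrative): pool both groups and drop every
    point lying more than z pooled SDs from the pooled mean."""
    pooled = a + b
    mu = statistics.fmean(pooled)
    sd = statistics.stdev(pooled)
    kept_a = [x for x in a if abs(x - mu) <= z * sd]
    kept_b = [x for x in b if abs(x - mu) <= z * sd]
    return kept_a, kept_b


def mean_difference(a, b):
    """Estimated group difference: mean of b minus mean of a."""
    return statistics.fmean(b) - statistics.fmean(a)


def simulate(true_diff=1.0, n=50, reps=2000, z=2.0, seed=1):
    """Return the average estimated difference without removal and after
    across-group removal, over `reps` simulated data sets."""
    rng = random.Random(seed)
    raw, trimmed = [], []
    for _ in range(reps):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        raw.append(mean_difference(a, b))
        kept_a, kept_b = remove_outliers_across_groups(a, b, z)
        trimmed.append(mean_difference(kept_a, kept_b))
    return statistics.fmean(raw), statistics.fmean(trimmed)


if __name__ == "__main__":
    raw_est, trimmed_est = simulate()
    print(f"average estimate without removal:            {raw_est:.3f}")
    print(f"average estimate after across-group removal: {trimmed_est:.3f}")
```

Under these assumptions, the pooled cutoff sits asymmetrically around each group's own mean: it trims mostly the low tail of the lower group and the high tail of the higher group. Both group means are pulled toward the pooled mean, so the estimated difference shrinks toward zero. When `true_diff=0.0` the cutoff is symmetric for both groups and no such shrinkage is expected.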