Solberg Helge Erik, Lahti Ari
Department of Medical Biochemistry, Rikshospitalet-Radiumhospitalet HF, Oslo, Norway.
Clin Chem. 2005 Dec;51(12):2326-32. doi: 10.1373/clinchem.2005.058339. Epub 2005 Oct 13.
Medical laboratory reference data may be contaminated with outliers that should be eliminated before estimation of the reference interval. A statistical test for outliers has been proposed by Paul S. Horn and coworkers (Clin Chem 2001;47:2137-45). The algorithm operates in 2 steps: (a) mathematically transform the original data to approximate a gaussian distribution; and (b) establish detection limits (Tukey fences) based on the central part of the transformed distribution.
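The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation: the Box and Cox parameter is picked by a coarse grid search over the gaussian profile log-likelihood, the fences use the conventional Tukey multiplier k = 1.5 on the quartiles, and all function names (box_cox, estimate_lambda, tukey_fences, horn_outliers) are hypothetical.

```python
import math
import statistics


def box_cox(x, lam):
    """Box and Cox power transform of a positive value x."""
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def box_cox_loglik(data, lam):
    """Profile log-likelihood of lam, assuming gaussian transformed data."""
    n = len(data)
    y = [box_cox(v, lam) for v in data]
    var = statistics.pvariance(y)
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in data)


def estimate_lambda(data, grid=None):
    """Step (a): pick lam on a grid (assumed range -2..2, step 0.05)."""
    if grid is None:
        grid = [i / 20 for i in range(-40, 41)]
    return max(grid, key=lambda lam: box_cox_loglik(data, lam))


def tukey_fences(values, k=1.5):
    """Step (b): fences k interquartile ranges beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def horn_outliers(data, lam=None, k=1.5):
    """Return the original values whose transforms fall outside the fences."""
    if lam is None:
        lam = estimate_lambda(data)
    y = [box_cox(v, lam) for v in data]
    low, high = tukey_fences(y, k)
    return [v for v, t in zip(data, y) if t < low or t > high]
```

Because the transformation parameter is estimated from the same data that may contain outliers, extreme values can influence the fit; passing a fixed lam (for example lam=0.0 for a log transform) separates the two steps when experimenting.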
We studied the specificity of Horn's test algorithm (probability of false detection of outliers), using Monte Carlo computer simulations performed on 13 types of probability distributions covering a wide range of positive and negative skewness. To examine the sensitivity of the test (probability of detection of true outliers), we also used distributions in which 3% of the original observations were replaced by random outliers. Three data transformations were used: the Box and Cox function (used in the original Horn's test), the Manly exponential function, and the John and Draper modulus function.
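A simulation of this kind can be sketched as follows. The sketch is an illustration only, not the study's code: the sample size (n = 120), number of runs, seed, test distributions, and the use of fixed rather than estimated transformation parameters are all assumptions, and the function names are hypothetical. The Manly and modulus transforms are given in their standard textbook forms.

```python
import math
import random
import statistics


def manly(x, lam):
    """Manly exponential transform; defined for negative x as well."""
    return x if lam == 0 else (math.exp(lam * x) - 1.0) / lam


def modulus(x, lam):
    """John and Draper modulus transform; antisymmetric about zero."""
    sign = 1.0 if x >= 0 else -1.0
    if lam == 0:
        return sign * math.log(abs(x) + 1.0)
    return sign * ((abs(x) + 1.0) ** lam - 1.0) / lam


def flagged_fraction(values, k=1.5):
    """Fraction of values outside Tukey fences built from the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return sum(v < low or v > high for v in values) / len(values)


def false_detection_rate(draw, transform=lambda v: v, n=120, runs=500, seed=1):
    """Mean fraction of uncontaminated observations flagged as outliers."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        sample = [transform(draw(rng)) for _ in range(n)]
        total += flagged_fraction(sample)
    return total / runs


# Gaussian data, skewed data left untransformed, and skewed data after a log
# transform (the Box and Cox function with parameter 0).
gauss_rate = false_detection_rate(lambda rng: rng.gauss(0.0, 1.0))
raw_lognormal_rate = false_detection_rate(lambda rng: rng.lognormvariate(0.0, 1.0))
log_rate = false_detection_rate(lambda rng: rng.lognormvariate(0.0, 1.0),
                                transform=math.log)
```

Comparing the three rates shows why the transformation step matters: without it, the long tail of a skewed distribution is flagged far more often than gaussian data would be. Sensitivity could be estimated the same way by replacing a fixed share of each sample with values drawn from a separate outlier distribution and counting how many of those are flagged.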
For many of the probability distributions, the specificity of Horn's algorithm was rather poor compared with the theoretical expectation. This poor performance was at least partially attributable to nongaussian kurtosis (peakedness) remaining after transformation. The sensitivity varied greatly, depending on both the type of underlying distribution and the location of the outliers (upper and/or lower tail).
Although Horn's algorithm undoubtedly is an improvement compared with older methods for outlier detection, reliable statistical identification of outliers in reference data remains a challenge.