A multi-stage Gaussian transformation algorithm for clinical laboratory data.

Author information

Boyd J C, Lacher D A

Publication information

Clin Chem. 1982 Aug;28(8):1735-41.

Abstract

We have developed a multi-stage computer algorithm to transform non-normally distributed data to a normal distribution. This transformation is of value for calculation of laboratory reference intervals and for normalization of clinical laboratory variates before applying statistical procedures in which underlying data normality is assumed. The algorithm is able to normalize most laboratory data distributions with either negative or positive coefficients of skewness or kurtosis. Stepwise, a logarithmic transform removes asymmetry (skewness), then a Z-score transform and power function transform remove residual peakedness or flatness (kurtosis). Powerful statistical tests of data normality in the procedure help the user evaluate both the necessity for and the success of the data transformation. Erroneous assessments of data normality caused by rounded laboratory test values have been minimized by introducing computer-generated random noise into the data values. Reference interval endpoints that were estimated parametrically (mean ± 2 SD) by using successfully transformed data were found to have a smaller root-mean-squared error than those estimated by the non-parametric percentile technique.
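The stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the uniform noise model, the shifted logarithm log(x + c), the sign-preserving power function, the grid searches, and every function name below are assumptions made for illustration. The skewness and kurtosis coefficients stand in for the unnamed normality tests, and the sketch covers only positively skewed data.

```python
import math
import random
import statistics


def skewness(xs):
    """Moment coefficient of skewness (0 for a symmetric sample)."""
    m, s = statistics.fmean(xs), statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def excess_kurtosis(xs):
    """Moment coefficient of kurtosis minus 3 (0 for a Gaussian)."""
    m, s = statistics.fmean(xs), statistics.pstdev(xs)
    return sum(((x - m) / s) ** 4 for x in xs) / len(xs) - 3.0


def dither(xs, resolution, rng):
    """Add uniform noise spanning one rounding interval (assumed noise model)."""
    half = resolution / 2.0
    return [x + rng.uniform(-half, half) for x in xs]


def log_stage(xs):
    """Pick the shift c that makes log(x + c) least skewed (grid search)."""
    lo, span = min(xs), max(xs) - min(xs)
    # Every candidate keeps x + c strictly positive: x + c = x - lo + offset.
    shifts = [span * f - lo for f in (0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30, 100)]
    best = min(shifts, key=lambda c: abs(skewness([math.log(x + c) for x in xs])))
    return [math.log(x + best) for x in xs], best


def z_scores(xs):
    """Center on the mean and scale by the sample standard deviation."""
    m, s = statistics.fmean(xs), statistics.stdev(xs)
    return [(x - m) / s for x in xs]


def signed_power(z, p):
    """Sign-preserving power: p < 1 pulls tails in, p > 1 pushes them out."""
    return math.copysign(abs(z) ** p, z)


def power_stage(zs):
    """Pick the exponent that leaves the least excess kurtosis (grid search)."""
    exponents = (0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.35, 1.5, 1.75, 2.0)
    best = min(
        exponents,
        key=lambda p: abs(excess_kurtosis([signed_power(z, p) for z in zs])),
    )
    return [signed_power(z, best) for z in zs], best


# Hypothetical demo data: right-skewed values reported to one decimal place.
rng = random.Random(0)
raw = [round(rng.lognormvariate(0.0, 0.5), 1) for _ in range(2000)]

noisy = dither(raw, resolution=0.1, rng=rng)
logged, shift = log_stage(noisy)
final, exponent = power_stage(z_scores(logged))

for label, data in (("raw", raw), ("after log", logged), ("final", final)):
    print(f"{label:10s} skew={skewness(data):+.3f} kurt={excess_kurtosis(data):+.3f}")
```

Comparing the coefficients before and after each stage mirrors the abstract's point that normality checks show both whether a transformation is needed and whether it succeeded. A parametric reference interval would then be taken as mean ± 2 SD on the transformed scale and mapped back through the inverse of each stage.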
