Meloun M, Hill M, Militký J, Kupka K
Department of Analytical Chemistry, Faculty of Chemical Technology, University of Pardubice, Czech Republic.
Clin Chem Lab Med. 2000 Jun;38(6):553-9. doi: 10.1515/CCLM.2000.081.
Data transformations express the original data on a new scale that is more suitable for data analysis. In computer-aided interactive analysis of biochemical and clinical data, exploratory data analysis often finds that the sample distribution is systematically skewed or that the assumption of sample homogeneity is rejected. Under such circumstances the original data should be transformed. The power transformation and the Box-Cox transformation improve sample symmetry and also stabilize variance. Both the Hines-Hines selection graph and the plot of the logarithm of the maximum likelihood function allow selection of an optimum transformation parameter. The proposed procedure of data transformation in univariate data analysis is illustrated by the determination of 17-hydroxypregnenolone in the umbilical blood of a population of newborns. Lower levels of free 5-ene steroids in umbilical blood and elevated levels of 5-ene steroid sulfates indicate a congenital sex-specific placental sulfatase insufficiency. After the statistical assumptions are examined with the diagnostic plots of exploratory data analysis, the best estimate of the mean value of 17-hydroxypregnenolone is derived.
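As a rough illustration of selecting the transformation parameter from the logarithm of the likelihood function, the following minimal sketch evaluates the standard Box-Cox profile log-likelihood on a grid of λ values and picks the maximum. It is not the authors' software or data: the sample is synthetic (log-normal quantiles), the function names `boxcox`, `inv_boxcox`, `profile_loglik`, and `skewness` are hypothetical, and the back-transformed mean shown here is the simplest possible re-expression, which may differ from the estimator used in the paper.

```python
import math
from statistics import NormalDist, fmean


def boxcox(x, lam):
    """Box-Cox transform of strictly positive values for parameter lam."""
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]


def inv_boxcox(y, lam):
    """Inverse Box-Cox transform of a single value (assumes lam * y + 1 > 0)."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


def profile_loglik(x, lam):
    """Profile log-likelihood: -(n/2) ln s^2(y) + (lam - 1) * sum(ln x)."""
    n = len(x)
    y = boxcox(x, lam)
    m = fmean(y)
    var = sum((v - m) ** 2 for v in y) / n  # maximum likelihood variance
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in x)


def skewness(x):
    """Moment coefficient of skewness, used here as a symmetry check."""
    m = fmean(x)
    m2 = fmean([(v - m) ** 2 for v in x])
    m3 = fmean([(v - m) ** 3 for v in x])
    return m3 / m2 ** 1.5


# Synthetic, right-skewed sample (assumption: NOT data from the paper).
# The logarithms of these values are evenly spaced standard normal quantiles.
n = 200
x = [math.exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]

# Grid search over lambda in [-2, 2] with step 0.05.
grid = [i / 20 for i in range(-40, 41)]
best_lam = max(grid, key=lambda lam: profile_loglik(x, lam))

# Transform with the selected lambda and re-express the mean on the original scale.
y_best = boxcox(x, best_lam)
back_mean = inv_boxcox(fmean(y_best), best_lam)

print("selected lambda:", best_lam)
print("skewness before:", skewness(x))
print("skewness after: ", skewness(y_best))
print("back-transformed mean:", back_mean)
print("arithmetic mean:      ", fmean(x))
```

Because the synthetic sample is log-normal by construction, the grid search should select λ at or near 0 (the logarithmic transformation), and the skewness of the transformed values should be close to zero. Plotting `profile_loglik` against the grid would give the kind of log-likelihood plot the abstract mentions; the Hines-Hines selection graph is not reproduced here.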