Durbin B P, Hardin J S, Hawkins D M, Rocke D M
Department of Statistics, UC Davis, Davis, CA 95616, USA.
Bioinformatics. 2002;18 Suppl 1:S105-10. doi: 10.1093/bioinformatics/18.suppl_1.s105.
Standard statistical techniques often assume that data are normally distributed, with constant variance not depending on the mean of the data. Data that violate these assumptions can often be brought in line with the assumptions by application of a transformation. Gene-expression microarray data have a complicated error structure, with a variance that changes with the mean in a non-linear fashion. Log transformations, which are often applied to microarray data, can inflate the variance of observations near background.
We introduce a transformation that stabilizes the variance of microarray data across the full range of expression. Simulation studies also suggest that this transformation approximately symmetrizes microarray data.
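The abstract does not state the transformation or the error model, so the following is only an illustrative sketch under stated assumptions. It assumes a two-component error model, y = alpha + mu*exp(eta) + eps with normally distributed eta and eps, and a generalized-logarithm transformation of the form ln(y - alpha + sqrt((y - alpha)^2 + c)) with c = sigma_eps^2 / S_eta^2. All parameter values and the function names glog, simulate, and variance_summary are hypothetical choices made for this example. The sketch compares the variance of raw, log-transformed, and generalized-log-transformed simulated intensities at three expression levels.

```python
import math
import random
import statistics


def glog(y, alpha, c):
    """Assumed generalized log: ln(y - alpha + sqrt((y - alpha)^2 + c))."""
    z = y - alpha
    return math.log(z + math.sqrt(z * z + c))


def simulate(mu, n, alpha, sigma_eta, sigma_eps, rng):
    """Draw n intensities from the assumed model y = alpha + mu*exp(eta) + eps."""
    return [
        alpha
        + mu * math.exp(rng.gauss(0.0, sigma_eta))
        + rng.gauss(0.0, sigma_eps)
        for _ in range(n)
    ]


def variance_summary(seed=0, n=20000):
    """Variance of raw, log, and glog values at several true expression levels."""
    rng = random.Random(seed)
    # Hypothetical parameter values, chosen only for illustration.
    alpha, sigma_eta, sigma_eps = 0.0, 0.2, 20.0
    # Variance of exp(eta) for normal eta, which scales the high-intensity error.
    s_eta_sq = math.exp(sigma_eta ** 2) * (math.exp(sigma_eta ** 2) - 1.0)
    c = sigma_eps ** 2 / s_eta_sq

    rows = []
    for mu in (0.0, 1000.0, 100000.0):
        y = simulate(mu, n, alpha, sigma_eta, sigma_eps, rng)
        # The plain log is undefined at or below background, so those
        # observations are dropped for the log column only.
        positive = [v for v in y if v > alpha]
        rows.append({
            "mu": mu,
            "n_dropped_for_log": n - len(positive),
            "var_raw": statistics.variance(y),
            "var_log": statistics.variance([math.log(v - alpha) for v in positive]),
            "var_glog": statistics.variance([glog(v, alpha, c) for v in y]),
        })
    return rows


if __name__ == "__main__":
    for row in variance_summary():
        print(row)
```

Under these assumptions, the raw variance should grow by several orders of magnitude with mu, the log variance should be much larger near background than at high expression (with roughly half of the background observations unusable), and the generalized-log variance should stay roughly constant across all three levels.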