Huber Wolfgang, von Heydebreck Anja, Sueltmann Holger, Poustka Annemarie, Vingron Martin
German Cancer Research Center, Heidelberg, Germany.
Stat Appl Genet Mol Biol. 2003;2:Article3. doi: 10.2202/1544-6115.1008. Epub 2003 Apr 5.
We derive and validate an estimator for the parameters of a transformation for the joint calibration (normalization) and variance stabilization of microarray intensity data. After this transformation, the variances of the intensities are approximately independent of their expected values. The transformation is similar to the logarithm in the high-intensity range, but has a smaller slope for intensities close to zero. Applications have shown better sensitivity and specificity for the detection of differentially expressed genes. In this paper, we describe the theoretical aspects of the method. We incorporate calibration and variance-mean dependence into a statistical model and use a robust variant of the maximum-likelihood method to estimate the transformation parameters. Using simulations, we investigate the size of the estimation error and its dependence on sample size and the presence of outliers. We find that the error decreases in inverse proportion to the square root of the number of probes per array and that the estimation is robust against the presence of differentially expressed genes. Software is publicly available as an R package through the Bioconductor project (http://www.bioconductor.org).
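The variance-stabilizing behaviour described above can be illustrated with a small simulation. The sketch below is not the estimator from the paper. It assumes an arsinh-type transformation (the abstract does not state the functional form, but arsinh is log-like at high values and has a finite slope near zero) and an additive-multiplicative noise model. The noise parameters are taken as known rather than estimated by robust maximum likelihood. All function names and numeric values (`glog`, `simulate`, `sd_by_level`, `sd_mult`, `sd_add`) are hypothetical choices for illustration.

```python
import numpy as np


def glog(y, offset=0.0, scale=1.0):
    """Arsinh-type transform (assumed form): log-like for large values,
    approximately linear near zero."""
    return np.arcsinh(offset + scale * np.asarray(y, dtype=float))


def simulate(mu, n, sd_mult, sd_add, rng):
    """Draw n intensities around the level mu with multiplicative
    (log-normal) and additive (normal) noise. Hypothetical noise model."""
    eta = rng.normal(0.0, sd_mult, size=n)  # multiplicative error
    nu = rng.normal(0.0, sd_add, size=n)    # additive background error
    return mu * np.exp(eta) + nu


def sd_by_level(levels, n=20000, sd_mult=0.1, sd_add=20.0, seed=0):
    """Return (level, sd of raw data, sd of transformed data) per level."""
    rng = np.random.default_rng(seed)
    # For Var(y) ~ sd_add**2 + (sd_mult * mu)**2, the delta method gives
    # arsinh(scale * y) with scale = sd_mult / sd_add as a stabilizer.
    scale = sd_mult / sd_add
    rows = []
    for mu in levels:
        y = simulate(mu, n, sd_mult, sd_add, rng)
        rows.append((mu, y.std(), glog(y, scale=scale).std()))
    return rows


if __name__ == "__main__":
    for mu, sd_raw, sd_h in sd_by_level([0.0, 100.0, 1000.0, 10000.0]):
        print(f"mu={mu:8.0f}  sd(raw)={sd_raw:9.2f}  sd(transformed)={sd_h:.3f}")
```

Under these assumptions, the standard deviation of the raw intensities grows with the intensity level, while the standard deviation of the transformed values stays close to `sd_mult` across all levels. In practice the transformation parameters are unknown and have to be estimated from the data, which is the problem the paper addresses.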