Landfors Mattias, Fahlén Jessica, Rydén Patrik
Department of Mathematics and Mathematical Statistics, Umeå University, Sweden.
Stat Appl Genet Mol Biol. 2009;8:Article 42. doi: 10.2202/1544-6115.1459. Epub 2009 Oct 1.
Pre-processing plays a vital role in two-color microarray data analysis. An analysis is characterized by its ability to identify differentially expressed genes (its sensitivity) and its ability to provide unbiased estimators of the true regulation (its bias). It has been shown that microarray experiments regularly underestimate the true regulation of differentially expressed genes. We introduce MC-normalization, where C stands for channel-wise normalization, which has considerably lower bias than the commonly used standard methods. The idea behind MC-normalization is that the channels' individual intensities determine the correction, rather than the average intensity, as is the case for the widely used MA-normalization. The two methods were evaluated using spike-in data from an in-house produced cDNA experiment and a publicly available Agilent experiment. The methods were applied to background-corrected and non-background-corrected data. For the cDNA experiment, the methods were applied either separately to the data from each print-tip or to the complete array data. Altogether, 24 analyses were evaluated. For each analysis, the sensitivity, the bias and two variance measures were estimated. We prove that MC-normalization has lower bias than MA-normalization. The spike-in data confirmed the theoretical result and suggest that the difference is significant. Furthermore, the empirical data suggest that MC- and MA-normalization have similar sensitivity. A striking result is that print-tip normalizations had considerably higher sensitivity than analyses using the complete array data.
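To make the MA-normalization that the abstract contrasts against concrete, the following is a minimal sketch rather than the authors' implementation. It computes the log-ratio M and the average log-intensity A of each spot and then removes the intensity-dependent trend in M. The function names, the use of a binned median in place of the loess fit usually applied in practice, and the number of bins are all assumptions introduced for illustration.

```python
import numpy as np


def ma_transform(red, green):
    """Return (M, A): log2 ratio and average log2 intensity per spot."""
    log_r = np.log2(np.asarray(red, dtype=float))
    log_g = np.log2(np.asarray(green, dtype=float))
    return log_r - log_g, 0.5 * (log_r + log_g)


def binned_median_correction(m, covariate, n_bins=10):
    """Subtract from each M the median M of spots with a similar covariate.

    Hypothetical stand-in for a loess fit: spots are grouped into
    quantile bins of the covariate and each bin is centred on zero.
    """
    m = np.asarray(m, dtype=float)
    covariate = np.asarray(covariate, dtype=float)
    edges = np.quantile(covariate, np.linspace(0.0, 1.0, n_bins + 1))
    # Bin index in 0..n_bins-1; the maximum value is clipped into the last bin.
    idx = np.clip(np.searchsorted(edges, covariate, side="right") - 1,
                  0, n_bins - 1)
    corrected = m.copy()
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            corrected[mask] -= np.median(m[mask])
    return corrected
```

Calling binned_median_correction(m, a) with the values returned by ma_transform gives an MA-style correction, because the covariate is the average intensity A. A channel-wise method would instead base the correction on the individual log2 channel intensities. The abstract does not state how the two channels are combined, so that step is deliberately left out of the sketch.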