Park Taesung, Yi Sung-Gon, Kang Sung-Hyun, Lee SeungYeoun, Lee Yong-Sung, Simon Richard
Department of Statistics, Seoul National University, Seoul, Korea.
BMC Bioinformatics. 2003 Sep 2;4:33. doi: 10.1186/1471-2105-4-33.
Microarray technology allows the expression levels of thousands of genes to be monitored simultaneously. This novel technique helps us to understand gene regulation and gene-gene interactions more systematically. In microarray experiments, however, many undesirable systematic variations are observed, and some variation is common even among replicated experiments. Normalization is the process of removing some sources of variation that affect the measured gene expression levels. Although a number of normalization methods have been proposed, it has been difficult to decide which methods perform best. Normalization plays an important role in the early stages of microarray data analysis, and the results of subsequent analyses are highly dependent on it.
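To make the distinction between normalization families concrete, the following is a minimal sketch, not the authors' implementation. It assumes two-channel intensities expressed as a log-ratio M and a mean log-intensity A, a median shift as the global method, and an ordinary least-squares line of M on A as the intensity-dependent linear method. All function names are hypothetical.

```python
import math
from statistics import mean, median


def ma_values(red, green):
    """Return (M, A) lists: log2 ratio and mean log2 intensity per spot."""
    m, a = [], []
    for r, g in zip(red, green):
        log_r, log_g = math.log2(r), math.log2(g)
        m.append(log_r - log_g)
        a.append(0.5 * (log_r + log_g))
    return m, a


def global_median_normalize(m):
    """Global normalization (assumed form): shift log-ratios to median zero."""
    center = median(m)
    return [x - center for x in m]


def linear_intensity_normalize(m, a):
    """Intensity-dependent linear normalization (assumed form):
    subtract the least-squares line fitted to M as a function of A."""
    a_bar, m_bar = mean(a), mean(m)
    sxx = sum((x - a_bar) ** 2 for x in a)
    sxy = sum((x - a_bar) * (y - m_bar) for x, y in zip(a, m))
    slope = sxy / sxx
    intercept = m_bar - slope * a_bar
    return [y - (intercept + slope * x) for x, y in zip(a, m)]
```

A nonlinear intensity-dependent method would replace the fitted line with a smooth curve, such as a local regression fit. That variant is omitted here to keep the sketch free of external dependencies.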
In this paper, we use the variability among replicated slides to compare the performance of normalization methods. We also compare the methods with regard to bias and mean squared error using simulated data.
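The comparison criteria can be sketched as follows. This is an illustrative assumption rather than the exact estimators used in the paper: replicate variability is taken to be the per-gene sample variance of normalized log-ratios across replicated slides, averaged over genes, and bias and mean squared error are computed against known true values from a simulation. Function names are hypothetical.

```python
from statistics import mean, variance


def mean_replicate_variance(slides):
    """Average per-gene sample variance across replicated slides.

    `slides` is a list of replicate slides, each a list of normalized
    log-ratios in the same gene order. Lower values suggest that the
    normalization removed more slide-to-slide variation.
    """
    per_gene = [variance(values) for values in zip(*slides)]
    return mean(per_gene)


def bias_and_mse(estimates, truth):
    """Return (bias, mean squared error) of estimates against true values."""
    errors = [e - t for e, t in zip(estimates, truth)]
    bias = mean(errors)
    mse = mean([err ** 2 for err in errors])
    return bias, mse
```

Under these assumptions, the same raw data would be normalized by each candidate method and the resulting values of `mean_replicate_variance` compared, with smaller values favoring a method.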
Our results show that intensity-dependent normalization often performs better than global normalization methods, and that linear and nonlinear normalization methods perform similarly. These conclusions are based on analysis of 36 cDNA microarrays of 3,840 genes obtained in an experiment to search for changes in gene expression profiles during neuronal differentiation of cortical stem cells. Simulation studies confirm our findings.