Rotter Ana, Hren Matjaz, Baebler Spela, Blejec Andrej, Gruden Kristina
Department of Biotechnology and Systems Biology, National Institute of Biology, 1000 Ljubljana, Slovenia.
OMICS. 2008 Sep;12(3):171-82. doi: 10.1089/omi.2008.0032.
Because of the great variety of preprocessing tools available for two-channel expression microarray data analysis, it is difficult to choose the most appropriate one for a given experimental setup. In our study, two independent two-channel in-house microarray experiments and a publicly available dataset were used to investigate how the choice of preprocessing methods (background correction, normalization, and duplicate-spot correlation calculation) influences the discovery of differentially expressed genes. We show that both the list of differentially expressed genes and the expression values of selected genes depend significantly on the preprocessing approach applied. The choice of normalization method had the greatest impact on the results. We propose a simple but efficient approach to increase the reliability of the results: two theoretically distinct normalization methods are applied to the same dataset, and the intersection of the resulting lists of differentially expressed genes is used to obtain a more accurate estimate of the genes that were actually differentially expressed.
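The intersection step described in the abstract can be illustrated with a minimal Python sketch. It assumes each normalization method has already produced its own list of differentially expressed genes; the function name `consensus_de_genes`, the method labels, and the gene identifiers are all hypothetical and do not come from the study.

```python
def consensus_de_genes(de_by_method):
    """Return the genes called differentially expressed under every method.

    de_by_method maps a normalization-method label to an iterable of gene IDs.
    """
    gene_sets = [set(genes) for genes in de_by_method.values()]
    if not gene_sets:
        return set()
    # Keep only genes present in all per-method lists.
    return set.intersection(*gene_sets)


# Hypothetical gene lists from two theoretically distinct normalization
# methods applied to the same dataset (labels and IDs are placeholders).
de_lists = {
    "normalization_A": ["geneA", "geneB", "geneC", "geneD"],
    "normalization_B": ["geneB", "geneC", "geneE"],
}

consensus = consensus_de_genes(de_lists)
print(sorted(consensus))  # ['geneB', 'geneC']
```

Taking the intersection is conservative: a gene is retained only if it is detected regardless of which normalization was applied, which trades some sensitivity for fewer method-dependent calls.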