Wang Deli, Huang Jian, Xie Hehuang, Manzella Liliana, Soares Marcelo Bento
Biostatistics and Bioinformatics Unit, Comprehensive Cancer Center, the University of Alabama at Birmingham, Birmingham, AL 35294, USA.
BMC Bioinformatics. 2005 Jan 21;6:14. doi: 10.1186/1471-2105-6-14.
Normalization is a basic step in microarray data analysis. A proper normalization procedure ensures that the intensity ratios provide meaningful measures of relative expression values.
We propose a robust semiparametric method in a two-way semi-linear model (TW-SLM) for normalization of cDNA microarray data. This method does not make the usual assumptions underlying some of the existing methods. For example, it does not assume that: (i) the percentage of differentially expressed genes is small; or (ii) the numbers of up- and down-regulated genes are about the same, as required in the LOWESS normalization method. We conduct simulation studies to evaluate the proposed method and use a real data set from a specially designed microarray experiment to compare the performance of the proposed method with that of the LOWESS normalization approach.
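To make the idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual two-way semi-linear form y_ij = f_i(x_ij) + beta_j + noise, where y_ij is the log intensity ratio of gene j on array i, x_ij is its average log intensity, f_i is an array-specific normalization curve and beta_j is the gene effect; this formula is not stated in the abstract. The robust estimator is replaced here by a simple stand-in: a binned-median smoother for f_i and a median across arrays for beta_j, alternated by backfitting, with the gene effects centred to mean zero as an assumed identifiability convention. The names `binned_median_curve` and `fit_tw_slm` and all simulation settings are hypothetical.

```python
import random
from statistics import median


def binned_median_curve(x, r, n_bins=10):
    """Robust piecewise-constant smoother: median of r within roughly
    equal-count bins of x (a stand-in for a spline or LOWESS fit)."""
    order = sorted(range(len(x)), key=lambda k: x[k])
    size = max(1, len(x) // n_bins)
    fitted = [0.0] * len(x)
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        level = median([r[k] for k in idx])
        for k in idx:
            fitted[k] = level
    return fitted


def fit_tw_slm(x, y, n_iter=20, n_bins=10):
    """Backfitting sketch for the assumed model
    y[i][j] = f_i(x[i][j]) + beta[j] + noise (i: arrays, j: genes)."""
    n_arrays, n_genes = len(y), len(y[0])
    beta = [0.0] * n_genes
    f = [[0.0] * n_genes for _ in range(n_arrays)]
    for _ in range(n_iter):
        # Step 1: per array, smooth the residuals after removing gene effects.
        for i in range(n_arrays):
            resid = [y[i][j] - beta[j] for j in range(n_genes)]
            f[i] = binned_median_curve(x[i], resid, n_bins)
        # Step 2: per gene, take the median normalized log ratio across arrays.
        beta = [median([y[i][j] - f[i][j] for i in range(n_arrays)])
                for j in range(n_genes)]
        # Assumed identifiability convention: gene effects have mean zero;
        # the removed shift is moved into the curves.
        shift = sum(beta) / n_genes
        beta = [b - shift for b in beta]
        f = [[v + shift for v in row] for row in f]
    return beta, f


if __name__ == "__main__":
    rng = random.Random(0)
    n_arrays, n_genes = 4, 300
    # Hypothetical scenario: 60% of genes up-regulated, none down-regulated,
    # so neither assumption (i) nor (ii) holds.
    true_beta = [rng.uniform(0.5, 2.0) if rng.random() < 0.6 else 0.0
                 for _ in range(n_genes)]
    centre = sum(true_beta) / n_genes
    true_beta = [b - centre for b in true_beta]
    x = [[rng.uniform(6.0, 14.0) for _ in range(n_genes)]
         for _ in range(n_arrays)]
    y = [[0.1 * (i + 1) + 0.02 * (i + 1) * (x[i][j] - 10.0) ** 2
          + true_beta[j] + rng.gauss(0.0, 0.2)
          for j in range(n_genes)] for i in range(n_arrays)]
    beta_hat, _ = fit_tw_slm(x, y)
    mse = sum((b - t) ** 2 for b, t in zip(beta_hat, true_beta)) / n_genes
    print(f"MSE of estimated gene effects: {mse:.4f}")
```

The point of the sketch is that the curves and the gene effects are estimated jointly, so no gene has to be assumed non-differentially expressed in order to fit the curve. A LOWESS-style normalization would instead fit each array's curve to the raw log ratios, which is where assumptions (i) and (ii) enter.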
The simulation results show that the proposed method outperforms the LOWESS normalization method in terms of the mean squared error of the estimated gene effects. Analysis of the real data set shows that the proposed method yields more consistent results between the direct and indirect comparisons and detects more differentially expressed genes than the LOWESS method.
Our simulation studies and the real data example indicate that the proposed robust TW-SLM method works at least as well as the LOWESS method and works better when the underlying assumptions for the LOWESS method are not satisfied. Therefore, it is a powerful alternative to the existing normalization methods.