Fujita André, Sato João Ricardo, Rodrigues Leonardo de Oliveira, Ferreira Carlos Eduardo, Sogayar Mari Cleide
Institute of Mathematics and Statistics, University of São Paulo, Rua do Matão, 1010, São Paulo, 05508-090 SP, Brazil.
BMC Bioinformatics. 2006 Oct 23;7:469. doi: 10.1186/1471-2105-7-469.
DNA hybridization microarray technologies now make it possible to assess the expression levels of thousands to tens of thousands of genes simultaneously. Quantitative comparison of microarrays uncovers distinct patterns of gene expression, which define different cellular phenotypes or cellular responses to drugs. Because of technical biases, normalization of the intensity levels is a prerequisite for further statistical analysis. Choosing a suitable normalization approach can therefore be critical and deserves careful consideration.
Here, we considered three commonly used normalization approaches (Loess, Splines and Wavelets) and two non-parametric regression methods that have yet to be used for normalization (Kernel smoothing and Support Vector Regression). The results were compared using artificial microarray data and benchmark studies. They indicate that Support Vector Regression is the most robust to outliers and that Kernel smoothing is the worst normalization technique, while no practical differences were observed among Loess, Splines and Wavelets.
In light of our results, Support Vector Regression is favored for microarray normalization because it estimates the normalization curve more robustly than the other methods.
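To make the idea of a "normalization curve" concrete, the following is a minimal sketch of intensity-dependent normalization for two-channel data: the log-ratio M is regressed on the mean log-intensity A, and the fitted curve is subtracted from M. It uses Kernel smoothing only because that method is the simplest of the five to write without external libraries, not because it is recommended (the study ranks it last). The Gaussian kernel, the bandwidth of 0.5, the function names, and the synthetic dye bias are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def to_ma(red, green):
    """Convert two-channel intensities to A (mean log2 intensity) and M (log2 ratio)."""
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    return a, m


def kernel_smooth(a, m, bandwidth=0.5):
    """Nadaraya-Watson estimate of E[M | A] at every spot (Gaussian kernel, assumed)."""
    fitted = []
    for a0 in a:
        weights = [math.exp(-0.5 * ((ai - a0) / bandwidth) ** 2) for ai in a]
        fitted.append(sum(w * mi for w, mi in zip(weights, m)) / sum(weights))
    return fitted


def normalize(red, green, bandwidth=0.5):
    """Subtract the estimated normalization curve from M; A is returned unchanged."""
    a, m = to_ma(red, green)
    curve = kernel_smooth(a, m, bandwidth)
    return a, [mi - ci for mi, ci in zip(m, curve)]


def dye_bias(a):
    """Hypothetical intensity-dependent bias, strongest at low intensity."""
    return math.exp(-(a - 6.0) / 3.0)


def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)


# Synthetic, non-differentially-expressed spots: M should be zero apart from bias and noise.
rng = random.Random(0)
n_spots = 300
a_true = [rng.uniform(6.0, 14.0) for _ in range(n_spots)]
m_true = [dye_bias(a) + rng.gauss(0.0, 0.1) for a in a_true]
red = [2.0 ** (a + m / 2.0) for a, m in zip(a_true, m_true)]
green = [2.0 ** (a - m / 2.0) for a, m in zip(a_true, m_true)]

_, m_raw = to_ma(red, green)
_, m_norm = normalize(red, green)

print("mean |M| before normalization:", round(mean_abs(m_raw), 3))
print("mean |M| after normalization: ", round(mean_abs(m_norm), 3))
```

Swapping `kernel_smooth` for another regression estimator (Loess, splines, wavelets, or Support Vector Regression) changes only how the curve is fitted; the subtraction step stays the same, which is what makes the methods directly comparable.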