Park Taesung, Yi Sung-Gon, Shin Young Kee, Lee SeungYeoun
Department of Statistics, College of Pharmacy, Seoul National University, Seoul, Korea.
Bioinformatics. 2006 Jul 15;22(14):1682-9. doi: 10.1093/bioinformatics/btl183. Epub 2006 May 16.
Microarray technology enables the monitoring of expression levels for thousands of genes simultaneously. As experiments grow in scale, it becomes common to use the same type of microarray at different laboratories or hospitals. It is therefore important to analyze such microarray data together and derive a combined conclusion after accounting for the differences between sources. One of the main objectives of a microarray experiment is to identify genes that are differentially expressed among the experimental groups. The analysis of variance (ANOVA) model has been commonly used to detect differentially expressed genes after accounting for the sources of variation typically observed in microarray experiments.
We extended the usual ANOVA model to account for additional variability resulting from confounding variables such as the effect of different hospitals. The proposed model is a two-stage ANOVA model. The first stage adjusts for the effects of no interest. The second stage detects differentially expressed genes among the experimental groups using the residuals obtained from the first stage. Based on these residuals, we propose a permutation test to detect the differentially expressed genes. The proposed model is illustrated using data from 133 microarrays collected at three different hospitals. The proposed approach is more flexible to use than the meta-analysis approach, and it accommodates individual covariates more easily.
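The two-stage idea described above can be sketched for a single gene as follows. This is a minimal illustration under stated assumptions, not the authors' R implementation: the function names `one_way_f` and `two_stage_permutation_test` are hypothetical, the first stage adjusts only for a hospital effect by subtracting per-hospital means, and the permutation scheme (shuffling group labels across all arrays) is assumed, since the abstract does not specify it.

```python
import numpy as np


def one_way_f(values, labels):
    """One-way ANOVA F statistic of `values` across the levels of `labels`."""
    levels = np.unique(labels)
    grand_mean = values.mean()
    ss_between = 0.0
    ss_within = 0.0
    for level in levels:
        members = values[labels == level]
        ss_between += members.size * (members.mean() - grand_mean) ** 2
        ss_within += ((members - members.mean()) ** 2).sum()
    df_between = levels.size - 1
    df_within = values.size - levels.size
    return (ss_between / df_between) / (ss_within / df_within)


def two_stage_permutation_test(expr, hospital, group, n_perm=999, seed=0):
    """Two-stage test for one gene (illustrative sketch, hypothetical API).

    expr     : 1-D array of expression values, one per array
    hospital : 1-D array of nuisance labels (the effect of no interest)
    group    : 1-D array of experimental group labels
    """
    rng = np.random.default_rng(seed)

    # Stage 1: remove the nuisance effect. With a single categorical
    # nuisance factor, the ANOVA residuals are the values minus the
    # mean of their own hospital.
    residuals = np.asarray(expr, dtype=float).copy()
    for h in np.unique(hospital):
        mask = hospital == h
        residuals[mask] -= residuals[mask].mean()

    # Stage 2: test for a group effect on the residuals. The F statistic
    # is used only to rank permutations, so its nominal degrees of
    # freedom do not need to reflect the first-stage fit.
    observed = one_way_f(residuals, group)
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(group)  # assumed scheme: shuffle all labels
        if one_way_f(residuals, permuted) >= observed:
            exceed += 1

    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value
```

In practice this would be applied gene by gene, followed by a multiple-testing adjustment. Further covariates could be handled in the first stage by replacing the per-hospital centering with a least-squares fit on a design matrix.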
A set of programs written in R will be electronically sent upon request.