Neuroregeneration Laboratory, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA Amsterdam, the Netherlands.
BMC Genomics. 2010 Feb 17;11:112. doi: 10.1186/1471-2164-11-112.
Ratio-based analysis is the current standard for dual-color microarray data. This method provides a powerful means of accounting for technical variation such as differences in background signal, spot size and spot concentration. However, current high-density dual-color array platforms are of very high quality, and inter-array variance has become much less pronounced. We therefore asked whether intensity-based analysis is a feasible alternative to ratio-based analysis of dual-color microarray datasets, and we compared the performance of the two approaches in terms of reproducibility and sensitivity for detecting differential gene expression.
By analyzing three distinct, technically replicated datasets with either ratio- or intensity-based models, we determined that, when applied to the same dataset, intensity-based analysis of dual-color gene expression experiments 1) yields more reproducible results and 2) is more sensitive in detecting differentially expressed genes. These effects were most pronounced in experiments with large biological variation and complex hybridization designs. Furthermore, a power analysis revealed that for direct two-group comparisons above a certain sample size, ratio-based models have higher power, although the difference from intensity-based models is very small.
Intensity-based analysis of dual-color datasets yields more reproducible results and greater sensitivity in the detection of differential gene expression than ratio-based analysis of the same dataset. Complex dual-color setups such as interwoven loop designs benefit most from ignoring the array factor. The applicability of our approach to array platforms other than dual-color needs to be further investigated.
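The distinction between the two analyses can be illustrated with a minimal sketch for a single gene in a direct two-group comparison. It is a simplification under stated assumptions, not the models used in the study: the ratio-based version is reduced to a paired t-statistic on per-array log-ratios, the intensity-based version to an unpaired (Welch) t-statistic that ignores the array factor, and the function names and input layout (one intensity per group per array) are hypothetical.

```python
import numpy as np


def ratio_based(group_a, group_b):
    """Paired analysis: one log2 ratio per array, so the array effect cancels.

    group_a[i] and group_b[i] are the two channel intensities measured on
    array i (hypothetical layout, one gene only).
    Returns (mean log2 fold change, paired t-statistic).
    """
    a = np.log2(np.asarray(group_a, dtype=float))
    b = np.log2(np.asarray(group_b, dtype=float))
    m = a - b                                   # per-array log-ratio
    se = m.std(ddof=1) / np.sqrt(m.size)        # standard error of the mean ratio
    return m.mean(), m.mean() / se


def intensity_based(group_a, group_b):
    """Unpaired analysis: each channel is a separate observation.

    The array factor is ignored, so the pairing of channels plays no role.
    Returns (difference of mean log2 intensities, Welch t-statistic).
    """
    a = np.log2(np.asarray(group_a, dtype=float))
    b = np.log2(np.asarray(group_b, dtype=float))
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    effect = a.mean() - b.mean()
    return effect, effect / se


# Made-up intensities for one gene on four arrays (illustration only).
group_a = [1024, 2048, 4096, 2048]
group_b = [512, 512, 2048, 2048]

print("ratio-based:    ", ratio_based(group_a, group_b))
print("intensity-based:", intensity_based(group_a, group_b))
```

In a balanced design like this one, both functions return the same fold-change estimate; only the standard error differs. Which statistic is larger depends on how strongly the two channels on the same array are correlated, since the variance of a paired difference equals the sum of the two channel variances minus twice their covariance.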