Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC 3052, Australia.
BMC Bioinformatics. 2009 Nov 11;10:372. doi: 10.1186/1471-2105-10-372.
Illumina Sentrix-6 Whole-Genome Expression BeadChips are a relatively new microarray platform that has been used in many microarray studies in the past few years. These BeadChips have a unique design: each chip contains six microarrays, and each microarray consists of two separate physical strips. This design poses special challenges for precise between-array normalization of expression values.
None of the normalization strategies proposed so far for this microarray platform allows for the possibility of systematic variation between the two strips comprising each array. A data example illustrates that this variation can be substantial. We demonstrate that normalizing at the strip level rather than at the array level can effectively remove this between-strip variation, improve the precision of gene expression measurements, and lead to the discovery of more differentially expressed genes. The gain is substantial, yielding a 20% increase in statistical information and doubling the number of genes detected at a 5% false discovery rate. Functional analysis reveals that the extra genes found tend to be biologically meaningful, dramatically strengthening the biological conclusions from the experiment. Strip-level normalization still outperforms array-level normalization when non-expressed probes are filtered out.
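The difference between the two normalization levels can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not confirm: quantile normalization is used as the between-array method, the two strips of an array carry different probes, and the helper names (`quantile_normalize`, `normalize_array_level`, `normalize_strip_level`) and the toy log2 intensities are purely hypothetical. It is not the R code supplied with the paper.

```python
from statistics import mean


def quantile_normalize(columns):
    """Force equal-length columns onto a common distribution.

    Assumption: plain quantile normalization; ties are broken by position
    rather than averaged.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean across columns at each rank.
    reference = [mean(col[i] for col in sorted_cols) for i in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def normalize_array_level(arrays):
    """Normalize whole arrays: both strips are pooled into one column."""
    names = list(arrays)
    n1 = len(arrays[names[0]]["strip1"])
    cols = quantile_normalize(
        [arrays[name]["strip1"] + arrays[name]["strip2"] for name in names]
    )
    return {
        name: {"strip1": col[:n1], "strip2": col[n1:]}
        for name, col in zip(names, cols)
    }


def normalize_strip_level(arrays):
    """Normalize strip 1 across arrays and strip 2 across arrays separately."""
    names = list(arrays)
    result = {name: {} for name in names}
    for strip in ("strip1", "strip2"):
        cols = quantile_normalize([arrays[name][strip] for name in names])
        for name, col in zip(names, cols):
            result[name][strip] = col
    return result


# Hypothetical log2 intensities; strip 2 of array "B" is shifted upward.
toy_arrays = {
    "A": {"strip1": [6.0, 8.0, 10.0], "strip2": [7.0, 9.0, 11.0]},
    "B": {"strip1": [6.2, 8.1, 10.3], "strip2": [9.0, 11.0, 13.0]},
}

array_level = normalize_array_level(toy_arrays)
strip_level = normalize_strip_level(toy_arrays)
```

In this toy data, array-level normalization leaves the strip 1 values of arrays A and B on different distributions, because the shifted strip 2 of array B changes the pooled ranks. Strip-level normalization gives each strip the same distribution across arrays by construction.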
We propose plots that show how the need for strip-level normalization relates to inconsistent variation in intensity range between the strips. Strip-level normalization is recommended for the preprocessing of Illumina Sentrix-6 BeadChips whenever the intensity range is seen to be inconsistent between the strips. R code is provided to implement the recommended plots and normalization algorithms.
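The kind of check that such plots support can be sketched numerically. Everything specific here is an assumption made for illustration: the intensity range is taken as the gap between the 5th and 95th percentiles of log2 intensities, the tolerance of 0.5 is arbitrary, and the function names and data are hypothetical. The paper's own definitions and plots may differ.

```python
from statistics import quantiles


def intensity_range(values, lower_pct=5, upper_pct=95):
    """Spread of log2 intensities between two percentiles (assumed definition)."""
    # quantiles(..., n=100) returns 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    return cuts[upper_pct - 1] - cuts[lower_pct - 1]


def strip_range_differences(arrays):
    """Per array: intensity range of strip 1 minus intensity range of strip 2."""
    return {
        name: intensity_range(strips["strip1"]) - intensity_range(strips["strip2"])
        for name, strips in arrays.items()
    }


def ranges_inconsistent(arrays, tolerance=0.5):
    """Flag a data set whose strip-to-strip range difference varies across arrays.

    The tolerance is a hypothetical cut-off, not a value from the paper.
    """
    diffs = list(strip_range_differences(arrays).values())
    return max(diffs) - min(diffs) > tolerance


# Hypothetical data: array "B" has a wider intensity range on strip 2 only.
inconsistent_example = {
    "A": {"strip1": [6.0, 7.0, 8.0, 9.0, 10.0], "strip2": [6.0, 7.0, 8.0, 9.0, 10.0]},
    "B": {"strip1": [6.0, 7.0, 8.0, 9.0, 10.0], "strip2": [6.0, 7.5, 9.0, 10.5, 12.0]},
}
consistent_example = {
    "A": {"strip1": [6.0, 7.0, 8.0, 9.0, 10.0], "strip2": [6.0, 7.0, 8.0, 9.0, 10.0]},
    "B": {"strip1": [6.0, 7.0, 8.0, 9.0, 10.0], "strip2": [6.0, 7.0, 8.0, 9.0, 10.0]},
}

flag_inconsistent = ranges_inconsistent(inconsistent_example)
flag_consistent = ranges_inconsistent(consistent_example)
```

Plotting the per-array differences from `strip_range_differences` against array index would give a visual version of the same check.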