Robinson Mark D, Speed Terence P
Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3050, Australia.
BMC Bioinformatics. 2007 Nov 15;8:449. doi: 10.1186/1471-2105-8-449.
Affymetrix GeneChips are an important tool in many facets of biological research. Recently, notable design changes to the chips have been made. In this study, we use publicly available data from Affymetrix to gauge the performance of three human gene expression arrays: Human Genome U133 Plus 2.0 (U133), Human Exon 1.0 ST (HuEx) and Human Gene 1.0 ST (HuGene).
We studied probe-, exon- and gene-level reproducibility of technical and biological replicates from each of the 3 platforms. The U133 array has larger feature sizes, so it is no surprise that its probe-level variances are smaller; however, the larger number of probes per gene on the HuGene array seems to produce gene-level summaries with similar variances. The gene-level summaries of the HuEx array are less reproducible than those of the other two arrays, despite HuEx having the largest average number of probes per gene. More than 80% of the content on the HuEx arrays is expressed at or near background. Biological variation seems to have a smaller effect on U133 data. Comparing the overlap of differentially expressed genes, we see high overall concordance among all 3 platforms, with HuEx and HuGene having greater overlap, as expected given their design. We analysed detection rates and area under ROC curves using an experiment made up of several mixtures of 2 human tissues. Although the HuEx array appears to perform worse in terms of detection rates, all arrays have a similar ability to separate differentially expressed from non-differentially expressed genes.
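The area under the ROC curve mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function name `roc_auc`, the per-gene scores (for instance, absolute test statistics) and the truth labels are all hypothetical, and the sketch assumes that a mixture design supplies a label saying which genes are truly differentially expressed. It uses the rank interpretation of AUC, namely the probability that a randomly chosen differentially expressed gene scores higher than a randomly chosen non-differentially expressed gene, with ties counted as one half.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: per-gene evidence for differential expression (higher = stronger).
    labels: truthy for genes assumed to be truly differentially expressed.
    Both inputs are hypothetical illustrations, not data from the study.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative gene")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive gene ranked above negative gene
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(pos) * len(neg))


# Toy example: four genes, two of them labelled as truly changed.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)  # 3 of 4 pairs ordered correctly
```

An AUC near 1 means the scores rank truly changed genes above unchanged ones almost always, while 0.5 corresponds to random ordering. The pairwise loop is quadratic in the number of genes; a rank-based formulation would be preferable for genome-scale data.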
Despite noticeable differences in probe-level reproducibility, gene-level reproducibility and differential expression detection are quite similar across the three platforms. The HuEx array, an all-encompassing design, has the flexibility to measure all known or predicted exonic content. However, the HuEx array shows poorer reproducibility for genes with fewer exons. The HuGene array measures only the well-annotated genome content and appears to perform well. The U133 array, though not able to measure across the full length of a transcript, appears to perform as well as the newer designs on the set of genes common to all 3 platforms.