Division of Biomedical Informatics, Department of Pediatrics, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
Mol Genet Metab. 2010 Mar;99(3):309-18. doi: 10.1016/j.ymgme.2009.10.179. Epub 2009 Oct 30.
Microarray expression profiling has become a valuable tool for evaluating the genetic consequences of metabolic disease. Although 3'-biased gene expression microarray platforms were the first generation to become widely available, newer platforms with more up-to-date content and/or higher cost efficiency are gradually emerging. Deciphering the relative strengths and weaknesses of these platforms for metabolic pathway-level analyses can be daunting. We sought to determine the practical strengths and weaknesses of four leading commercially available expression array platforms for biological investigations, and to assess the feasibility of cross-platform data integration for biochemical pathway analyses.
Liver RNA was extracted from B6.Alb/cre,Pdss2(loxP/loxP) mice with primary coenzyme Q deficiency, either at baseline or following treatment with probucol, an antioxidant/antihyperlipidemic agent. Target RNA samples were prepared and hybridized to Affymetrix 430 2.0, Affymetrix Gene 1.0 ST, Affymetrix Exon 1.0 ST, and Illumina Mouse WG-6 expression arrays. Probes on all platforms were re-mapped to coding sequences in the current version of the mouse genome. Data processing and statistical analysis were performed with R/Bioconductor functions, and pathway analyses were carried out with KEGG Atlas and GSEA.
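The re-mapping step implies that probe-level measurements must be brought to a common gene-level identifier before platforms can be compared. The following is a minimal sketch of that idea, not the authors' R/Bioconductor pipeline: it assumes log2 intensities, averages all probes that map to the same gene, and keeps only genes measured on every platform. The probe identifiers, the intensity values, and the function names `collapse_to_genes` and `shared_genes` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from statistics import mean


def collapse_to_genes(probe_values, probe_to_gene):
    """Average the log2 intensities of all probes mapped to the same gene.

    Probes without an entry in probe_to_gene (that is, probes that did not
    re-map to a coding sequence) are dropped.
    """
    by_gene = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is not None:
            by_gene[gene].append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}


def shared_genes(*gene_tables):
    """Return the sorted list of genes present in every gene-level table."""
    common = set(gene_tables[0])
    for table in gene_tables[1:]:
        common &= set(table)
    return sorted(common)


# Toy data: invented probe IDs and log2 intensities for two platforms.
platform_a_probes = {"a_p1": 8.0, "a_p2": 9.0, "a_p3": 6.5, "a_p4": 7.0}
platform_a_map = {"a_p1": "Pdss2", "a_p2": "Pdss2", "a_p3": "Alb"}  # a_p4 unmapped
platform_b_probes = {"b_p1": 8.4, "b_p2": 10.1}
platform_b_map = {"b_p1": "Pdss2", "b_p2": "GeneX"}

genes_a = collapse_to_genes(platform_a_probes, platform_a_map)
genes_b = collapse_to_genes(platform_b_probes, platform_b_map)
print(shared_genes(genes_a, genes_b))
```

Averaging is only one possible summarization rule; a median or a single best-matching probe per gene would be equally plausible choices under different assumptions about probe quality.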
Expression measurements were generally consistent across platforms. However, intensive probe-level comparison suggested that differences in probe locations were a major source of inter-platform variance. In addition, genes expressed at low or intermediate levels had lower inter-platform reproducibility than highly expressed genes. All platforms showed similar patterns of differential expression between sample groups, with 'steroid biosynthesis' consistently identified as the metabolic pathway most down-regulated by probucol treatment.
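One way to examine how inter-platform reproducibility depends on expression level is to rank the shared genes by average expression, split them into equal-sized bins, and compute a correlation within each bin. The sketch below uses Pearson correlation as an assumed agreement measure; the abstract does not state which statistic was used. The gene names, the values, and the function names `pearson` and `correlation_by_expression_level` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def correlation_by_expression_level(platform_a, platform_b, n_bins=3):
    """Correlate two gene-level tables within bins of increasing expression.

    Genes shared by both tables are ranked by their mean expression across
    the two platforms and split into n_bins groups, lowest first. The last
    bin absorbs any remainder.
    """
    genes = sorted(set(platform_a) & set(platform_b))
    genes.sort(key=lambda g: (platform_a[g] + platform_b[g]) / 2)
    size = len(genes) // n_bins
    if size < 2:
        raise ValueError("need at least two shared genes per bin")
    correlations = []
    for i in range(n_bins):
        start = i * size
        end = (i + 1) * size if i < n_bins - 1 else len(genes)
        chunk = genes[start:end]
        correlations.append(
            pearson([platform_a[g] for g in chunk], [platform_b[g] for g in chunk])
        )
    return correlations


# Toy data: invented log2 expression values for nine genes on two platforms.
a = {"g1": 4.1, "g2": 4.6, "g3": 5.0, "g4": 7.2, "g5": 7.9,
     "g6": 8.4, "g7": 11.0, "g8": 12.1, "g9": 13.0}
b = {"g1": 5.0, "g2": 4.2, "g3": 4.9, "g4": 7.0, "g5": 8.3,
     "g6": 8.1, "g7": 11.2, "g8": 12.0, "g9": 13.1}
print(correlation_by_expression_level(a, b, n_bins=3))
```

With real data, each bin would hold hundreds or thousands of genes; the three-gene bins here exist only to keep the example short.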
This work offers a timely guide to help metabolic disease investigators make informed decisions about which expression microarray platform is best suited to specific research project goals. Successful cross-platform integration of biochemical pathway expression data is also demonstrated, especially for well-annotated and highly expressed genes. However, integration of gene-level expression data is limited by individual platform probe design and the expression level of target genes. Cross-platform analyses of biochemical pathway data will require additional data processing and novel computational bioinformatics tools to address unique statistical challenges.
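Pathway-level integration can be illustrated by summarizing each platform's gene-level fold changes per pathway and checking whether the platforms agree on the most down-regulated pathway. The sketch below uses the mean log2 fold change of a pathway's measured genes as a deliberately simple score; it is a stand-in for, not a reimplementation of, the KEGG Atlas and GSEA analyses named above. The gene names, pathway memberships, fold-change values, and platform labels are invented toy data, and `pathway_scores` and `most_down_regulated` are hypothetical helper names.

```python
from statistics import mean


def pathway_scores(log_fold_changes, pathways):
    """Mean log2 fold change of the measured genes in each pathway.

    Pathways with no measured genes on this platform are omitted.
    """
    scores = {}
    for name, genes in pathways.items():
        measured = [log_fold_changes[g] for g in genes if g in log_fold_changes]
        if measured:
            scores[name] = mean(measured)
    return scores


def most_down_regulated(log_fold_changes, pathways):
    """Name of the pathway with the lowest mean log2 fold change."""
    scores = pathway_scores(log_fold_changes, pathways)
    return min(scores, key=scores.get)


# Toy pathway definitions and per-platform log2 fold changes (all invented).
pathways = {
    "steroid biosynthesis": ["geneA", "geneB", "geneC"],
    "other pathway": ["geneD", "geneE"],
}
fold_changes_by_platform = {
    "platform_1": {"geneA": -1.2, "geneB": -0.9, "geneC": -1.5,
                   "geneD": 0.1, "geneE": -0.2},
    "platform_2": {"geneA": -1.0, "geneB": -1.1,
                   "geneD": 0.3, "geneE": 0.0},
}

for platform, fold_changes in fold_changes_by_platform.items():
    print(platform, most_down_regulated(fold_changes, pathways))
```

Because the score is computed only from genes a platform actually measures, a pathway can still be compared across platforms when individual genes are missing from one of them, which is one reason pathway-level agreement can be easier to obtain than gene-level agreement.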