Experimental Rheumatology Unit, Department of Orthopedics, University Hospital Jena, Friedrich Schiller University, Jena, Germany.
BMC Med Genomics. 2012 Jun 8;5:23. doi: 10.1186/1755-8794-5-23.
Batch effects due to sample preparation or array variation (type, lot, and/or platform) may influence the results of microarray experiments and thus mask and/or confound true biological differences. Of the published approaches for batch correction, the algorithm "Combating Batch Effects When Combining Batches of Gene Expression Microarray Data" (ComBat) appears to be the most suitable for small sample sizes and multiple batches.
Synovial fibroblasts (SFB; purity > 98%) were obtained from rheumatoid arthritis (RA) and osteoarthritis (OA) patients (n = 6 each) and stimulated with TNF-α or TGF-β1 for 0, 1, 2, 4, or 12 hours. Gene expression was analyzed using Affymetrix Human Genome U133 Plus 2.0 chips, an alternative chip definition file, and normalization by Robust Multi-Array Analysis (RMA). Data were batch-corrected for different acquisition dates using ComBat, and the efficacy of the correction was validated using hierarchical clustering.
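To make the idea of batch correction concrete, the following is a minimal sketch of a plain location/scale adjustment for a single gene. It is not the ComBat algorithm: ComBat additionally models biological covariates and shrinks the per-batch estimates with an empirical Bayes step, both of which are omitted here. The function name `adjust_batches` and the input layout are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def adjust_batches(values, batches):
    """Location/scale-adjust one gene's expression values across batches.

    values  : log2 expression values, one per sample
    batches : batch labels (e.g. acquisition dates), same length as values

    Simplified illustration only; no covariates and no empirical Bayes
    shrinkage, so it is not equivalent to ComBat.
    """
    grand_mean = mean(values)

    # Group the values by batch label.
    by_batch = {}
    for v, b in zip(values, batches):
        by_batch.setdefault(b, []).append(v)

    # Pooled within-batch standard deviation, used as the common target scale.
    pooled_var = sum(pstdev(vs) ** 2 * len(vs) for vs in by_batch.values()) / len(values)
    pooled_sd = pooled_var ** 0.5

    adjusted = []
    for v, b in zip(values, batches):
        vs = by_batch[b]
        sd = pstdev(vs)
        # Leave the scale untouched when a batch has no spread.
        scale = pooled_sd / sd if sd > 0 else 1.0
        # Center on the batch mean, rescale, then shift to the grand mean.
        adjusted.append(grand_mean + (v - mean(vs)) * scale)
    return adjusted


# Two hypothetical batches whose means differ by 10 log2 units.
example = adjust_batches([1.0, 3.0, 11.0, 13.0], ["a", "a", "b", "b"])
```

Because this sketch ignores the disease group, it would also remove true RA/OA differences whenever the groups are unevenly distributed across batches; that is one reason a covariate-aware method is preferable in practice.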
In contrast to the hierarchical clustering dendrogram before batch correction, in which RA and OA patients clustered randomly, batch correction led to a clear separation of RA and OA. Strikingly, this applied not only to the 0 hour time point (i.e., before stimulation with TNF-α/TGF-β1), but also to all time points following stimulation except for the late 12 hour time point. Batch-corrected data then allowed the identification of differentially expressed genes discriminating between RA and OA. Batch correction only marginally modified the original data, as demonstrated by preservation of the main Gene Ontology (GO) categories of interest, and by minimally changed mean expression levels (maximal change 4.087%) or variances for all genes of interest. Eight genes from the GO category "extracellular matrix structural constituent" (5 different collagens, biglycan, and tubulointerstitial nephritis antigen-like 1) were differentially expressed between RA and OA (RA > OA), both constitutively at time point 0, and at all time points following stimulation with either TNF-α or TGF-β1.
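A check of how strongly batch correction alters mean expression levels could be sketched as follows. How the reported maximal change was actually computed is not stated in the abstract; this sketch assumes a relative change of the per-gene mean, and the function name `max_percent_mean_change`, the gene labels, and the values are hypothetical placeholders.

```python
def max_percent_mean_change(before, after):
    """Return (gene, percent) for the largest relative change in mean expression.

    before, after : dicts mapping a gene label to a list of expression values
                    (same genes in both; means before correction must be non-zero).
    """
    changes = {}
    for gene, raw in before.items():
        m0 = sum(raw) / len(raw)
        m1 = sum(after[gene]) / len(after[gene])
        # Relative change of the mean, expressed in percent.
        changes[gene] = abs(m1 - m0) / abs(m0) * 100.0
    worst = max(changes, key=changes.get)
    return worst, changes[worst]


# Hypothetical values for two placeholder genes.
before = {"GENE_A": [10.0, 10.0], "GENE_B": [8.0, 8.0]}
after = {"GENE_A": [10.2, 10.2], "GENE_B": [8.0, 8.4]}
worst_gene, worst_change = max_percent_mean_change(before, after)
```

A small maximal change across the genes of interest would support the claim that the correction only marginally modified the data.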
Batch correction appears to be an extremely valuable tool to eliminate non-biological batch effects, and allows the identification of genes discriminating between different joint diseases. RA-SFB show an upregulated expression of extracellular matrix components, both constitutively following isolation from the synovial membrane and upon stimulation with disease-relevant cytokines or growth factors, suggesting an "imprinted" alteration of their phenotype.