Computational Life Science Cluster (CLiC), Department of Chemistry, Umeå University, SE-901 87 Umeå, Sweden.
Anal Chem. 2012 Oct 16;84(20):8675-81. doi: 10.1021/ac301869p. Epub 2012 Oct 1.
We have developed a multistep strategy for integrating data from several large-scale experiments that suffer from systematic between-experiment variation. The strategy removes this variation, which would otherwise mask differences of interest. We applied it to wood chemical analysis data from 736 hybrid aspen trees: wild-type controls and transgenic trees modified in genes potentially involved in wood formation. The trees were grown in four different greenhouse experiments, which introduced significant variation between experiments. Pyrolysis coupled to gas chromatography/mass spectrometry (Py-GC/MS) was used as a high-throughput screening platform for fingerprinting the wood chemotype. The proposed strategy comprises quality control, outlier detection, gene-specific classification, and consensus analysis. Orthogonal projections to latent structures discriminant analysis (OPLS-DA) was used to generate a consensus chemotype profile for each transgenic line. These profiles were then compiled into a global dataset. Multivariate analysis and cluster analysis revealed a drastic reduction in between-experiment variation, which enabled a global analysis of all transgenic lines from the four independent experiments. Information from in-depth analysis of specific transgenic lines and from independent peak identification validated the proposed strategy.
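The core idea, describing each transgenic line relative to the wild-type controls grown in the same experiment so that experiment-wide offsets cancel, can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not implement OPLS-DA, and substitutes a plain standardized mean difference for the consensus profile. The data are simulated, and the names `line_vs_control_profiles`, `simulate`, `lineA`, and all parameter values are hypothetical.

```python
import numpy as np


def line_vs_control_profiles(X, experiment, line, control="WT"):
    """Return {(experiment, line): profile}.

    Each profile is the line's mean peak vector minus the mean of the
    same-experiment controls, in units of the control standard deviation.
    This is a simple stand-in for an OPLS-DA profile, not OPLS-DA itself.
    """
    profiles = {}
    for exp in np.unique(experiment):
        in_exp = experiment == exp
        ctrl = X[in_exp & (line == control)]
        mu = ctrl.mean(axis=0)
        sd = ctrl.std(axis=0, ddof=1)
        for name in np.unique(line[in_exp]):
            if name == control:
                continue
            rows = X[in_exp & (line == name)]
            profiles[(int(exp), str(name))] = (rows.mean(axis=0) - mu) / sd
    return profiles


def simulate(seed=0, n_exp=4, n_peaks=20, n_rep=8):
    """Simulate peak tables with a large per-experiment offset (assumed values)."""
    rng = np.random.default_rng(seed)
    effect = np.zeros(n_peaks)
    effect[:3] = 2.0  # hypothetical line effect on the first three peaks
    X, experiment, line = [], [], []
    for e in range(n_exp):
        batch = rng.normal(0.0, 5.0, n_peaks)  # systematic experiment offset
        for name, shift in (("WT", 0.0), ("lineA", effect)):
            X.append(batch + shift + rng.normal(0.0, 1.0, (n_rep, n_peaks)))
            experiment += [e] * n_rep
            line += [name] * n_rep
    return np.vstack(X), np.array(experiment), np.array(line)


X, experiment, line = simulate()
profiles = line_vs_control_profiles(X, experiment, line)

# Spread of the same line across experiments, before and after adjustment.
raw = np.array([X[(experiment == e) & (line == "lineA")].mean(axis=0)
                for e in range(4)])
adj = np.array([profiles[(e, "lineA")] for e in range(4)])
raw_spread = raw.std(axis=0).mean()
adj_spread = adj.std(axis=0).mean()
print(f"raw spread: {raw_spread:.2f}, adjusted spread: {adj_spread:.2f}")
```

In this toy setting the adjusted profiles of the same line agree far more closely across experiments than the raw means do, because the experiment offset is shared by the line and its controls and cancels in the difference. A real analysis would replace the mean difference with a supervised model fitted per line and would add the quality-control and outlier-detection steps.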