Department of Mathematical Sciences and School of Biological Sciences, University of Essex, Wivenhoe Park, Colchester, Essex, CO4 3SQ, UK.
Nucleic Acids Res. 2012 Oct;40(19):9705-16. doi: 10.1093/nar/gks717. Epub 2012 Aug 16.
An Affymetrix GeneChip consists of an array of hundreds of thousands of probes (each a sequence of 25 bases), with the probe values being used to infer the extent to which genes are expressed in the biological material under investigation. In this article, we demonstrate that these probe values are also strongly influenced by the precise base sequence of each probe. We use data from >28 000 CEL files relating to 10 different Affymetrix GeneChip platforms and involving nearly 1000 experiments. Our results confirm known effects (those due to the T7-primer and the formation of G-quadruplexes) but also reveal other effects. We show that there can be huge variations from one experiment to another, and that there may also be sizeable disparities between batches within an experiment and between CEL files within a batch.