Memon Farhat Naureen, Upton Graham J G, Harrison Andrew P
Departments of Mathematical Sciences and Biological Sciences, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK.
J Nucleic Acids. 2010 Jul 7;2010:489736. doi: 10.4061/2010/489736.
We have previously discovered that probes containing runs of four or more contiguous guanines are not reliable for measuring gene expression in the Human HG_U133A Affymetrix GeneChip data. These probes are not correlated with other members of their probe set, but they are correlated with each other. We now extend our analysis to different 3' GeneChip designs of mouse, rat, and human. We find that, in all these chip designs, the G-stack probes (probes with a run of exactly four consecutive guanines) are correlated highly with each other, indicating that such probes are not reliable measures of gene expression in mammalian studies. Furthermore, there is no specific position of G-stack where the correlation is highest in all the chips. We also find that the latest designs of rat and mouse chips have significantly fewer G-stack probes compared to their predecessors, whereas there has not been a similar reduction in G-stack density across the changes in human chips. Moreover, we find significant changes in RMA values (after removing G-stack probes) as the number of G-stack probes increases.
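As an illustration of the G-stack definition used above (a run of exactly four consecutive guanines), the following is a minimal sketch of how such probes might be flagged from their sequences. The function names and the regular expression are assumptions made for this example and are not taken from the paper; probe sequences are assumed to be plain strings of A, C, G, and T.

```python
import re

# Assumption: a "G-stack" is a run of exactly four G's, i.e. GGGG that is
# neither preceded nor followed by another G.
G_STACK = re.compile(r"(?<!G)G{4}(?!G)")


def g_stack_positions(probe_seq):
    """Return the 1-based start positions of runs of exactly four guanines."""
    return [m.start() + 1 for m in G_STACK.finditer(probe_seq.upper())]


def is_g_stack_probe(probe_seq):
    """Return True if the probe contains at least one run of exactly four G's."""
    return bool(g_stack_positions(probe_seq))
```

Recording the start position of each run, rather than only a yes/no flag, would allow probes to be grouped by G-stack position when comparing correlations across chip designs.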