Koren Amnon, Tirosh Itay, Barkai Naama
Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel.
BMC Genomics. 2007 Jun 12;8:164. doi: 10.1186/1471-2164-8-164.
DNA microarrays make it possible to interrogate many genes in a single experiment and have revolutionized genomic research. However, microarray technology suffers from various forms of bias and from relatively low reproducibility. One particular source of false data has been described, in which non-random placement of gene probes on the microarray surface is associated with spurious correlations between genes.
To assess the prevalence of this effect and better understand its origins, we applied an autocorrelation analysis of the relationship between chromosomal position and expression level to a database of over 2000 individual yeast microarray experiments. We show that at least 60% of these experiments exhibit spurious chromosomal position-dependent gene correlations, which nonetheless appear in a stochastic manner within each experimental dataset. Using computer simulations, we show that large spatial biases arising during the microarray hybridization step, independently of printing procedures, are sufficient on their own to account for the observed spurious correlations, in contrast to previous suggestions. Our data suggest that such biases may generate more than 15% false data per experiment. Importantly, spatial biases are expected to occur regardless of microarray design and across a wide range of microarray platforms, organisms and experimental procedures.
Spatial biases constitute a major source of noise in microarray studies; revising routine experimental practices and normalization procedures to account for these biases may significantly and comprehensively improve the quality of new as well as existing DNA microarray data.
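As a rough illustration of the kind of autocorrelation analysis described above, the following minimal Python sketch computes the autocorrelation of expression values ordered by chromosomal position and compares simulated data with and without a shared local bias. It is not the authors' implementation. The function names, the block size, the bias magnitude and the toy simulation are all hypothetical, and the simulation assumes that chromosomal neighbours also sit next to each other on the array, so that a local bias on the array surface is shared by adjacent genes.

```python
import random
from statistics import fmean


def positional_autocorrelation(values, max_lag):
    """Autocorrelation of values (ordered by chromosomal position) at lags 1..max_lag."""
    n = len(values)
    mean = fmean(values)
    centered = [v - mean for v in values]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("values are constant; autocorrelation is undefined")
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / variance
        for lag in range(1, max_lag + 1)
    ]


def simulate(n_genes=2000, block=50, bias=0.0, seed=0):
    """Hypothetical toy data: independent noise plus a shift shared by each block
    of neighbouring genes, standing in for a local bias on the array surface."""
    rng = random.Random(seed)
    values = []
    for start in range(0, n_genes, block):
        shift = rng.gauss(0.0, bias)  # one shared offset per block of neighbours
        size = min(block, n_genes - start)
        values.extend(rng.gauss(0.0, 1.0) + shift for _ in range(size))
    return values


# Without a shared bias, neighbouring genes should be essentially uncorrelated;
# with it, the autocorrelation at short lags becomes clearly positive.
unbiased = positional_autocorrelation(simulate(bias=0.0), max_lag=5)
biased = positional_autocorrelation(simulate(bias=1.0), max_lag=5)
print("unbiased:", [round(r, 3) for r in unbiased])
print("biased:  ", [round(r, 3) for r in biased])
```

In this sketch, a clearly positive autocorrelation at short lags in data where genes are expected to be independent would be the signature of a position-dependent artifact. A real analysis would also need to handle chromosome boundaries, missing values and the actual probe layout of the array.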