Division of Biostatistics, University of Minnesota, Minneapolis, MN 55455, USA.
J Genet Genomics. 2010 Apr;37(4):265-79. doi: 10.1016/S1673-8527(09)60045-X.
This research provides a new way to measure error in microarray data in order to improve gene expression analysis. Microarray data contain many sources of error, so the true signal must be separated from noise before anything can be learned about mRNA expression levels. The focus here is the variation that can be captured at the spot level in cDNA microarray images; variation at other levels, due to differences between arrays, dyes, and blocks, can be corrected by a variety of existing normalization procedures. Two signal quality estimates that capture the reliability of each spot printed on a microarray are described. A parametric estimate of within-spot variance, referred to here as σ²_spot, assumes that pixels follow a normal distribution and are spatially correlated. A non-parametric estimate of error, the mean square prediction error (MSPE), assumes that the pixels of a high-quality spot are similar to their neighbors. The paper provides a framework for using either spot quality measure in downstream analysis, specifically as weights in regression models. Using these spot quality estimates as weights can yield greater statistical efficiency when modeling microarray data.
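The idea of using a spot quality estimate as a regression weight can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not give the MSPE predictor or the weighting scheme, so the code below assumes that each interior pixel is predicted by the mean of its four nearest neighbors and that each spot's weight is the reciprocal of its error estimate. The function names neighbor_mspe and weighted_least_squares, and the eps floor, are hypothetical and chosen for illustration only.

```python
import numpy as np


def neighbor_mspe(spot):
    """Neighbor-based mean square prediction error for one spot.

    Assumption: each interior pixel is predicted by the mean of its
    four nearest neighbors (up, down, left, right). The abstract does
    not specify the predictor used in the paper.
    """
    spot = np.asarray(spot, dtype=float)
    predicted = (
        spot[:-2, 1:-1] + spot[2:, 1:-1] + spot[1:-1, :-2] + spot[1:-1, 2:]
    ) / 4.0
    return float(np.mean((spot[1:-1, 1:-1] - predicted) ** 2))


def weighted_least_squares(X, y, spot_error, eps=1e-12):
    """Fit y ~ X with per-spot weights 1 / spot_error.

    Assumption: weights are inverse error estimates (an MSPE or a
    within-spot variance). eps is a hypothetical floor that avoids
    division by zero for spots with a zero error estimate.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    weights = 1.0 / np.maximum(np.asarray(spot_error, dtype=float), eps)
    root_w = np.sqrt(weights)
    # Scaling rows by sqrt(weight) turns weighted least squares into
    # an ordinary least squares problem.
    beta, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
    return beta
```

Under these assumptions, a spot whose pixels vary smoothly receives a small error estimate and therefore a large weight, while a noisy spot contributes little to the fitted coefficients.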