McGee Monnie, Chen Zhongxue
Southern Methodist University.
Stat Appl Genet Mol Biol. 2006;5:Article24. doi: 10.2202/1544-6115.1237. Epub 2006 Sep 23.
There are many methods of correcting microarray data for non-biological sources of error. Authors routinely supply software or code so that interested analysts can implement their methods. Even with a thorough reading of associated references, it is not always clear how requisite parts of the method are calculated in the software packages. However, it is important to have an understanding of such details, as this understanding is necessary for proper use of the output, or for implementing extensions to the model. In this paper, the calculation of parameter estimates used in Robust Multichip Average (RMA), a popular preprocessing algorithm for Affymetrix GeneChip brand microarrays, is elucidated. The background correction method for RMA assumes that the perfect match (PM) intensities observed result from a convolution of the true signal, assumed to be exponentially distributed, and a background noise component, assumed to have a normal distribution. A conditional expectation is calculated to estimate signal. Estimates of the mean and variance of the normal distribution and the rate parameter of the exponential distribution are needed to calculate this expectation. Simulation studies show that the current estimates are flawed; therefore, new ones are suggested. We examine the performance of preprocessing under the exponential-normal convolution model using several different methods to estimate the parameters.
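The conditional expectation described above can be sketched as follows. This is a minimal illustration, not the code from any particular software package. It assumes the commonly cited closed form for the exponential-normal convolution model, in which the signal given the observed intensity follows a normal distribution with mean a = o - mu - sigma^2 * alpha and standard deviation sigma, truncated to the interval (0, o). The function and parameter names (`expected_signal`, `mu`, `sigma`, `alpha`) are hypothetical, and the parameter values in the usage note are arbitrary rather than estimated from data.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_signal(o, mu, sigma, alpha):
    """Sketch of E[S | O = o] when O = S + B, S ~ Exponential(rate=alpha),
    B ~ Normal(mu, sigma^2), with S restricted to (0, o).

    Assumed closed form (not taken from the abstract itself):
        a + b * (phi(a/b) - phi((o - a)/b)) / (Phi(a/b) + Phi((o - a)/b) - 1)
    where a = o - mu - sigma^2 * alpha and b = sigma.
    """
    a = o - mu - sigma ** 2 * alpha
    b = sigma
    numerator = normal_pdf(a / b) - normal_pdf((o - a) / b)
    denominator = normal_cdf(a / b) + normal_cdf((o - a) / b) - 1.0
    return a + b * numerator / denominator


if __name__ == "__main__":
    # Arbitrary illustrative parameter values, not estimates from real chips.
    for intensity in (50.0, 150.0, 300.0):
        print(intensity, expected_signal(intensity, mu=100.0, sigma=20.0, alpha=0.01))
```

The sketch takes `mu`, `sigma`, and `alpha` as given. In practice they must be estimated from the observed PM intensities, which is the step the paper examines, so the quality of the corrected signal depends directly on those three estimates.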