Chu Tzu-Ming, Weir B S, Wolfinger Russell D
Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA.
Bioinformatics. 2004 Mar 1;20(4):500-6. doi: 10.1093/bioinformatics/btg435. Epub 2004 Jan 22.
Li and Wong have described some useful statistical models for probe-level oligonucleotide array data based on a multiplicative parametrization. In earlier work, we proposed similar analysis-of-variance-style mixed models fit on a log scale. Because the two approaches differ only subtly in how they specify the mean and stochastic error components, a question arises as to whether they could lead to different conclusions in practical application.
In this paper, we provide an empirical comparison of the two models using a real data set, and find that they perform quite similarly across most genes, but with some interesting and important distinctions. We also present results from a simulation study designed to assess inferential properties of the models, and propose a modified test statistic for the Li-Wong model that improves Type I error control. Advantages of both methods include the ability to directly assess and account for key sources of variability in the chip data and a means to automate statistical quality control.
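To make the contrast between the two parametrizations concrete, the following is a minimal sketch and not the procedure used in the paper. It assumes the commonly cited form of the Li-Wong model for a single gene, y_ij = theta_i * phi_j + error with the constraint sum_j phi_j^2 = J, where y_ij is the PM minus MM difference for array i and probe j. For the log-scale approach it substitutes a simplified fixed-effects additive fit, log2(y_ij) = mu + a_i + b_j, in place of a full mixed model with random effects. The function names `fit_multiplicative` and `fit_log_additive` and the toy data are hypothetical.

```python
import math


def fit_multiplicative(y, iters=20):
    """Alternating least squares for y[i][j] ~ theta[i] * phi[j].

    Assumed Li-Wong style constraint: sum(phi[j] ** 2) == number of probes.
    """
    n, J = len(y), len(y[0])
    phi = [1.0] * J
    for _ in range(iters):
        # Update array-level expression indices given probe affinities.
        ss_phi = sum(p * p for p in phi)
        theta = [sum(y[i][j] * phi[j] for j in range(J)) / ss_phi for i in range(n)]
        # Update probe affinities given expression indices.
        ss_theta = sum(t * t for t in theta)
        phi = [sum(y[i][j] * theta[i] for i in range(n)) / ss_theta for j in range(J)]
        # Rescale so the identifiability constraint holds.
        scale = math.sqrt(J / sum(p * p for p in phi))
        phi = [p * scale for p in phi]
    # Final update so theta matches the rescaled phi.
    theta = [sum(y[i][j] * phi[j] for j in range(J)) / J for i in range(n)]
    return theta, phi


def fit_log_additive(y):
    """Fixed-effects simplification of a log-scale ANOVA-style model.

    For balanced data the least-squares effects are centered row and
    column means of log2(y). Requires strictly positive y.
    """
    logs = [[math.log2(v) for v in row] for row in y]
    n, J = len(logs), len(logs[0])
    grand = sum(sum(row) for row in logs) / (n * J)
    array_effect = [sum(row) / J - grand for row in logs]
    probe_effect = [sum(logs[i][j] for i in range(n)) / n - grand for j in range(J)]
    return grand, array_effect, probe_effect


# Hypothetical noise-free toy data: 3 arrays by 4 probes for one gene.
theta_true = [100.0, 200.0, 400.0]
raw_affinity = [0.5, 1.0, 1.5, 1.0]
c = math.sqrt(len(raw_affinity) / sum(r * r for r in raw_affinity))
phi_true = [r * c for r in raw_affinity]
y = [[t * p for p in phi_true] for t in theta_true]

theta_hat, phi_hat = fit_multiplicative(y)
grand, array_effect, probe_effect = fit_log_additive(y)

print("multiplicative expression indices:", theta_hat)
print("log-scale array effects:", array_effect)
```

On noise-free data of this kind both fits describe the same structure, since taking logs turns the product theta_i * phi_j into a sum. The practical differences the abstract refers to arise from how each model treats the error term and the variance components, which this sketch does not attempt to reproduce.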