Zhang Dabao, Wells Martin T, Smart Christine D, Fry William E
Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA.
J Comput Biol. 2005 May;12(4):391-406. doi: 10.1089/cmb.2005.12.391.
Commonly accepted intensity-dependent normalization in spotted microarray studies takes account of measurement errors in the differential expression ratio but ignores measurement errors in the total intensity, although the definitions imply the same measurement error components are involved in both statistics. Furthermore, identification of differentially expressed genes is usually considered separately following normalization, which is statistically problematic. By incorporating the measurement errors in both total intensities and differential expression ratios, we propose a measurement-error model for intensity-dependent normalization and identification of differentially expressed genes. This model is also flexible enough to incorporate intra-array and inter-array effects. A Bayesian framework is proposed for the analysis of the proposed measurement-error model to avoid the potential risk of using the common two-step procedure. We also propose a Bayesian identification of differentially expressed genes to control the false discovery rate instead of the ad hoc thresholding of the posterior odds ratio. The simulation study and an application to real microarray data demonstrate promising results.
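The abstract contrasts controlling the false discovery rate with ad hoc thresholding of the posterior odds ratio. A minimal sketch of one common way to do this follows; it is not necessarily the authors' exact rule. It assumes each gene already has a posterior probability of differential expression (for example, estimated from MCMC output), and it selects the largest set of genes whose average posterior probability of being a false discovery stays at or below a target level. The function name `bayesian_fdr_select`, the parameter `alpha`, and the example probabilities are hypothetical.

```python
import numpy as np

def bayesian_fdr_select(post_prob, alpha=0.05):
    """Flag genes as differentially expressed at a target Bayesian FDR.

    post_prob[i] is an assumed posterior probability that gene i is
    differentially expressed. Genes are taken in order of decreasing
    probability for as long as the mean of (1 - post_prob) over the
    selected set, i.e. the expected proportion of false discoveries
    under the posterior, stays <= alpha.
    """
    post_prob = np.asarray(post_prob, dtype=float)
    order = np.argsort(-post_prob)        # most probable genes first
    local_err = 1.0 - post_prob[order]    # posterior prob. of a false call
    # Running mean of an ascending sequence is nondecreasing, so counting
    # the entries <= alpha gives the size of the largest admissible set.
    running_fdr = np.cumsum(local_err) / np.arange(1, post_prob.size + 1)
    n_keep = int(np.sum(running_fdr <= alpha))

    mask = np.zeros(post_prob.size, dtype=bool)
    mask[order[:n_keep]] = True
    return mask

# Illustrative (made-up) posterior probabilities for five genes.
example = [0.99, 0.50, 0.97, 0.20, 0.95]
print(bayesian_fdr_select(example, alpha=0.05))
```

Unlike a fixed cutoff on the posterior odds, the selection threshold here adapts to the data: the same `alpha` can admit more or fewer genes depending on how sharply the posterior probabilities separate.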