Weng Lee, Dai Hongyue, Zhan Yihui, He Yudong, Stepaniants Sergey B, Bassett Douglas E
Rosetta Inpharmatics LLC, 401 Terry Avenue North, Seattle, WA 98109, USA.
Bioinformatics. 2006 May 1;22(9):1111-21. doi: 10.1093/bioinformatics/btl045. Epub 2006 Mar 7.
In microarray gene expression studies, the number of replicated microarrays is usually small because of cost and sample availability, resulting in unreliable variance estimation and thus unreliable statistical hypothesis tests. The unreliable variance estimation is further complicated by the fact that the technology-specific variance is intrinsically intensity-dependent.
The Rosetta error model captures the variance-intensity relationship for various types of microarray technologies, such as single-color arrays and two-color arrays. This error model conservatively estimates intensity error and uses this value to stabilize the variance estimation. We present two commonly used error models: the intensity error model for single-color microarrays and the ratio error model for two-color microarrays or ratios built from two single-color arrays. We present examples to demonstrate the strength of our error models in improving the statistical power of microarray data analysis, particularly in increasing expression detection sensitivity and specificity when the number of replicates is limited.
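To illustrate the general idea of stabilizing a small-sample variance with an intensity-dependent error estimate, the following is a minimal sketch only. It is not the published Rosetta formula: the abstract does not give the model's functional form, so the additive-plus-fractional variance, the degrees-of-freedom weighting, and all names and parameters (`model_variance`, `stabilized_variance`, `additive_sd`, `fractional_sd`, `prior_df`) are assumptions made for illustration.

```python
def model_variance(intensity, additive_sd, fractional_sd):
    """Hypothetical intensity-dependent variance.

    Assumes a constant additive (background) term plus a term that
    grows in proportion to intensity. This form is an illustrative
    assumption, not the model from the paper.
    """
    return additive_sd ** 2 + (fractional_sd * intensity) ** 2


def stabilized_variance(replicates, additive_sd, fractional_sd, prior_df=1.0):
    """Blend the replicate sample variance with the model-predicted variance.

    The sample variance is weighted by its degrees of freedom (n - 1) and
    the model variance by prior_df pseudo degrees of freedom (an assumed
    tuning parameter). With a single replicate, only the model variance
    is available.
    """
    n = len(replicates)
    mean = sum(replicates) / n
    predicted = model_variance(mean, additive_sd, fractional_sd)
    if n < 2:
        return predicted
    sample = sum((x - mean) ** 2 for x in replicates) / (n - 1)
    return ((n - 1) * sample + prior_df * predicted) / (n - 1 + prior_df)


if __name__ == "__main__":
    # Two replicates that happen to agree exactly: the sample variance is
    # zero, but the blended estimate stays above zero.
    print(stabilized_variance([100.0, 100.0], additive_sd=10.0, fractional_sd=0.1))
```

In this sketch, two replicates that agree by chance no longer yield a zero variance, which is the kind of instability a small replicate count produces in a test statistic. As the number of replicates grows, the sample variance dominates the blend.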