Yu Lianbo, Gulati Parul, Fernandez Soledad, Pennell Michael, Kirschner Lawrence, Jarjoura David
The Ohio State University, USA.
Stat Appl Genet Mol Biol. 2011 Sep 15;10(1). doi: 10.2202/1544-6115.1701.
Gene expression microarray experiments with few replicates yield highly variable estimates of gene variances. Several Bayesian methods have been developed to reduce this variability and to increase power. Thus far, moderated t methods have assumed a constant coefficient of variation (CV) for the gene variances. We provide evidence against this assumption and extend the method by allowing the CV to vary with gene expression. We compared our CV varying method, which we refer to as the fully moderated t-statistic, with three other methods (the ordinary t and two moderated t predecessors). A simulation study and a familiar spike-in data set were used to assess the performance of the testing methods. The results showed that our CV varying method had higher power than the other three methods, identified a greater number of true positives in the spike-in data, fit simulated data very well under varying assumptions, and, in a real data set, better identified higher-expressing genes that were consistent with functional pathways associated with the experiments.
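As a rough illustration of the shrinkage idea behind moderated t methods, the sketch below computes a generic moderated t-statistic for a single gene by pooling the gene's sample variance with a prior variance. It is not the authors' implementation: the function name `moderated_t` and its parameters are hypothetical, and the prior degrees of freedom `d0` and prior variance `s0_sq` are assumed to be supplied by the caller rather than estimated from the data. Passing gene-specific values for these priors is one way to mimic a prior that changes with expression level; how the paper estimates them is not described in the abstract.

```python
import math


def moderated_t(mean_diff, s2, df, d0, s0_sq, v):
    """Generic moderated t-statistic for one gene (illustrative sketch).

    mean_diff : estimated difference in mean log-expression
    s2        : the gene's own sample variance
    df        : residual degrees of freedom behind s2
    d0, s0_sq : prior degrees of freedom and prior variance
                (assumed given; they may differ from gene to gene)
    v         : unscaled variance factor, e.g. 1/n1 + 1/n2
    """
    # Posterior variance: weighted average of prior and sample variance.
    s2_post = (d0 * s0_sq + df * s2) / (d0 + df)
    t = mean_diff / math.sqrt(s2_post * v)
    # The moderated statistic is referred to a t distribution
    # with d0 + df degrees of freedom.
    return t, d0 + df


# With d0 = 0 the prior carries no weight and the ordinary t is recovered.
t_ordinary, df_ordinary = moderated_t(1.0, 0.04, 4, 0, 0.16, 0.5)
# With d0 = 4 the small sample variance is pulled toward the prior.
t_moderated, df_moderated = moderated_t(1.0, 0.04, 4, 4, 0.16, 0.5)
```

In this example the gene's unusually small sample variance inflates the ordinary t, while the moderated version shrinks the variance toward the prior and gains degrees of freedom, which is the mechanism these methods rely on when replicates are few.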