Department of Preventive Medicine, Stony Brook University, Stony Brook, NY 11794, USA.
BMC Bioinformatics. 2011 Nov 1;12:427. doi: 10.1186/1471-2105-12-427.
Many analyses of gene expression data involve hypothesis tests of an interaction term between two fixed effects, typically tested using a residual variance. In expression studies, heteroscedasticity has received much attention, and previous work has focused on either between-gene or within-gene heteroscedasticity. However, in a single experiment, heteroscedasticity may exist both within and between genes. Here we develop flexible shrinkage error estimators that account for both between-gene and within-gene heteroscedasticity and use them to construct F-like test statistics for testing interactions, with cutoff values obtained by permutation. Permutation tests for interactions are complicated, and several permutation procedures are investigated here.
Our proposed test statistics are compared with existing shrinkage-type test statistics through extensive simulation studies and a real data example. The results show that the choice of permutation procedure has a far greater influence on detection power than the choice of F or F-like test statistic. When both types of heteroscedasticity exist, our proposed test statistics control the preselected type-I error rate and are more powerful. Raw data permutation is not valid in this setting. Whether unrestricted or restricted residual permutation should be used depends on the specific type of test statistic.
The F-like test statistic that uses the proposed flexible shrinkage error estimator, which accounts for both types of heteroscedasticity, combined with unrestricted residual permutation provides a statistically valid and powerful test. We therefore recommend that it always be applied when testing an interaction term in the analysis of real gene expression data.
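As a rough illustration of the residual permutation idea, the following Python sketch tests a two-factor interaction for a single gene. It is not the authors' procedure. The abstract does not define the shrinkage estimator or the exact unrestricted and restricted schemes, so the sketch makes its own assumptions: it uses the ordinary F statistic with no shrinkage, takes residuals from the additive (no-interaction) model, and shuffles them freely across all observations in the style of a Freedman-Lane permutation. All function names are hypothetical.

```python
import numpy as np


def _design(a, b, interaction):
    """Dummy-coded design matrix for factors a and b (first level = reference)."""
    a_cols = [(a == lev).astype(float) for lev in np.unique(a)[1:]]
    b_cols = [(b == lev).astype(float) for lev in np.unique(b)[1:]]
    cols = [np.ones(len(a))] + a_cols + b_cols
    if interaction:
        # Interaction columns are products of the main-effect dummies.
        cols += [ca * cb for ca in a_cols for cb in b_cols]
    return np.column_stack(cols)


def _fit(X, y):
    """Least-squares fit; returns fitted values and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return fitted, y - fitted


def interaction_f(y, X_red, X_full):
    """Ordinary F statistic comparing the additive and the full model."""
    _, r_red = _fit(X_red, y)
    _, r_full = _fit(X_full, y)
    rss_red, rss_full = r_red @ r_red, r_full @ r_full
    df_int = X_full.shape[1] - X_red.shape[1]
    df_res = len(y) - X_full.shape[1]
    return ((rss_red - rss_full) / df_int) / (rss_full / df_res)


def residual_permutation_test(y, a, b, n_perm=999, seed=0):
    """Permutation p-value for the a-by-b interaction (assumed scheme).

    Residuals of the additive model are permuted across all observations
    and added back to the additive fit; this is an illustrative choice,
    not necessarily the scheme evaluated in the paper.
    """
    rng = np.random.default_rng(seed)
    X_red = _design(a, b, interaction=False)
    X_full = _design(a, b, interaction=True)
    f_obs = interaction_f(y, X_red, X_full)
    fitted_red, resid_red = _fit(X_red, y)
    exceed = 0
    for _ in range(n_perm):
        y_star = fitted_red + rng.permutation(resid_red)
        if interaction_f(y_star, X_red, X_full) >= f_obs:
            exceed += 1
    # Count the observed statistic as one permutation so p is never zero.
    return f_obs, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Simulated 2 x 2 design with 5 replicates per cell and a true interaction.
    rng = np.random.default_rng(1)
    a = np.repeat([0, 0, 1, 1], 5)
    b = np.repeat([0, 1, 0, 1], 5)
    y = rng.normal(size=20) + 2.0 * (a * b)
    f_obs, p_value = residual_permutation_test(y, a, b)
    print(f"F = {f_obs:.2f}, permutation p = {p_value:.3f}")
```

In a real expression study the statistic would be computed for every gene, and the denominator of the F statistic is where a shrinkage error estimator would be substituted; that step is left out here because the abstract does not specify its form.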