Touloumis Anestis, Tavaré Simon, Marioni John C
Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, U.K.
EMBL-European Bioinformatics Institute, Hinxton CB10 1SD, U.K.
Biometrics. 2015 Mar;71(1):157-166. doi: 10.1111/biom.12257. Epub 2015 Jan 23.
The structural information in high-dimensional transposable data allows us to write the data recorded for each subject in a matrix such that both the rows and the columns correspond to variables of interest. One important problem is to test the null hypothesis that the mean matrix has a particular structure without ignoring the dependence structure among and/or between the row and column variables. To address this, we develop a generic and computationally inexpensive nonparametric testing procedure to assess the hypothesis that, in each predefined subset of columns (rows), the column (row) mean vector remains constant. In simulation studies, the proposed testing procedure appears to perform well and, unlike simple practical approaches, preserves the nominal size and remains powerful even when the row and/or column variables are not independent. Finally, we illustrate the proposed methodology with two empirical examples from gene expression microarrays.
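The hypothesis described in the abstract can be written out as follows. This is only a sketch of one natural formalization: the symbols N, r, c, M, \mu_j, G_k and K are notation assumed here for illustration and are not taken from the abstract itself.

```latex
% Assumed notation: X_1, ..., X_N are the r x c data matrices of the N subjects,
% with common mean matrix M whose j-th column is the mean vector \mu_j.
\[
  \mathrm{E}[X_i] = M = (\mu_1, \ldots, \mu_c), \qquad \mu_j \in \mathbb{R}^{r}, \quad i = 1, \ldots, N .
\]
% G_1, ..., G_K are the predefined, mutually disjoint subsets of column indices.
\[
  H_0 :\ \mu_j = \mu_{j'} \quad \text{for all } j, j' \in G_k,\ k = 1, \ldots, K
  \qquad \text{versus} \qquad
  H_1 :\ \text{not } H_0 .
\]
```

Under this notation, the row version of the hypothesis is obtained by applying the same statement to the transposed matrices, which is what makes the data "transposable".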