Department of Computer Sciences, University of Wisconsin-Madison, USA.
Department of Computer Sciences, University of Wisconsin-Madison, USA.
Neuroimage. 2017 Oct 1;159:79-98. doi: 10.1016/j.neuroimage.2017.07.025. Epub 2017 Jul 15.
Permutation testing is a non-parametric method for obtaining the max null distribution used to compute corrected p-values that provide strong control of false positives. In neuroimaging, however, the computational burden of running such an algorithm can be significant. We find that by viewing the permutation testing procedure as the construction of a very large permutation testing matrix, T, one can exploit structural properties derived from the data and the test statistics to reduce the runtime under certain conditions. In particular, we see that T is low-rank plus a low-variance residual. This makes T a good candidate for low-rank matrix completion, where only a very small number of entries of T (∼0.35% of all entries in our experiments) have to be computed to obtain a good estimate. Based on this observation, we present RapidPT, an algorithm that efficiently recovers the max null distribution commonly obtained through regular permutation testing in voxel-wise analysis. We present an extensive validation on a synthetic dataset and four varying sized datasets against two baselines: Statistical NonParametric Mapping (SnPM13) and a standard permutation testing implementation (referred as NaivePT). We find that RapidPT achieves its best runtime performance on medium sized datasets (50≤n≤200), with speedups of 1.5× - 38× (vs. SnPM13) and 20x-1000× (vs. NaivePT). For larger datasets (n≥200) RapidPT outperforms NaivePT (6× - 200×) on all datasets, and provides large speedups over SnPM13 when more than 10000 permutations (2× - 15×) are needed. The implementation is a standalone toolbox and also integrated within SnPM13, able to leverage multi-core architectures when available.
置换检验是一种获取最大零分布的非参数方法,用于计算校正后的 p 值,以提供对假阳性的强有力控制。然而,在神经影像学中,运行这样的算法的计算负担可能很大。我们发现,通过将置换检验过程视为构建一个非常大的置换检验矩阵 T,可以利用从数据和检验统计量中得出的结构特性,在某些条件下减少运行时间。特别是,我们看到 T 具有低秩加低方差残差的性质。这使得 T 成为低秩矩阵补全的一个很好的候选者,在这种方法中,只需要计算 T 的很少一部分(我们实验中大约 0.35%的所有条目)就可以得到很好的估计。基于这一观察,我们提出了 RapidPT,这是一种有效地恢复在体素分析中通常通过常规置换检验获得的最大零分布的算法。我们在一个合成数据集和四个不同大小的数据集上进行了广泛的验证,对比了两个基线:统计非参数映射(SnPM13)和标准置换检验实现(称为 NaivePT)。我们发现,RapidPT 在中等大小的数据集(50≤n≤200)上实现了最佳的运行时性能,与 SnPM13 相比,速度提高了 1.5×-38×,与 NaivePT 相比,速度提高了 20x-1000×。对于更大的数据集(n≥200),RapidPT 在所有数据集上都优于 NaivePT(6×-200×),并且在需要超过 10000 次置换(2×-15×)时,相对于 SnPM13 提供了很大的速度提升。该实现是一个独立的工具箱,也可以集成在 SnPM13 中,当可用时可以利用多核架构。