Winkler Anderson M, Ridgway Gerard R, Douaud Gwenaëlle, Nichols Thomas E, Smith Stephen M
Oxford Centre for Functional MRI of the Brain, University of Oxford, Oxford, UK.
Oxford Centre for Functional MRI of the Brain, University of Oxford, Oxford, UK; Wellcome Trust Centre for Neuroimaging, UCL Institute of Neurology, London, UK.
Neuroimage. 2016 Nov 1;141:502-516. doi: 10.1016/j.neuroimage.2016.05.068. Epub 2016 Jun 7.
Permutation tests are increasingly being used as a reliable method for inference in neuroimaging analysis. However, they are computationally intensive. For small, non-imaging datasets, recomputing a model thousands of times is seldom a problem, but for large, complex models this can be prohibitively slow, even with the availability of inexpensive computing power. Here we exploit properties of statistics used with the general linear model (GLM) and their distributions to obtain accelerations irrespective of generic software or hardware improvements. We compare the following approaches: (i) performing a small number of permutations; (ii) estimating the p-value as a parameter of a negative binomial distribution; (iii) fitting a generalised Pareto distribution to the tail of the permutation distribution; (iv) computing p-values based on the expected moments of the permutation distribution, approximated from a gamma distribution; (v) direct fitting of a gamma distribution to the empirical permutation distribution; and (vi) permuting a reduced number of voxels, with completion of the remainder using low rank matrix theory. Using synthetic data we assessed the different methods in terms of their error rates, power, agreement with a reference result, and the risk of taking a different decision regarding the rejection of the null hypotheses (known as the resampling risk). We also conducted a re-analysis of a voxel-based morphometry study as a real-data example. All methods yielded exact error rates. Likewise, power was similar across methods. Resampling risk was higher for methods (i), (iii) and (v). For comparable resampling risks, the method in which no permutations are done (iv) was the absolute fastest. All methods produced visually similar maps for the real data, with stronger effects being detected in the family-wise error rate corrected maps by (iii) and (v), and generally similar to the results seen in the reference set. 
Overall, for uncorrected p-values, method (iv) was found to be the best as long as symmetric errors can be assumed. In all other settings, including for family-wise error corrected p-values, we recommend the tail approximation (iii). The methods considered are freely available in the tool PALM (Permutation Analysis of Linear Models).
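As a rough illustration of the tail approximation (iii), the sketch below fits a generalised Pareto distribution to the upper tail of a permutation distribution and uses it to refine the p-value. It is not PALM's implementation. The function names, the toy correlation statistic, and the fixed choice of treating the top 25% of permuted statistics as the tail are all assumptions made for this example.

```python
import numpy as np
from scipy import stats


def permutation_stats(x, y, n_perm, rng):
    """Observed and permuted statistics (toy statistic: |Pearson r|)."""
    observed = abs(np.corrcoef(x, y)[0, 1])
    permuted = np.array(
        [abs(np.corrcoef(x, rng.permutation(y))[0, 1]) for _ in range(n_perm)]
    )
    return observed, permuted


def empirical_p(observed, permuted):
    """Plain permutation p-value; the observed statistic counts as one draw."""
    return (1 + np.sum(permuted >= observed)) / (1 + permuted.size)


def gpd_tail_p(observed, permuted, tail_fraction=0.25):
    """p-value from a generalised Pareto fit to the permutation tail.

    tail_fraction is an assumed, fixed threshold choice for this sketch.
    """
    threshold = np.quantile(permuted, 1.0 - tail_fraction)
    if observed <= threshold:
        # Not in the tail: the empirical estimate is already adequate.
        return empirical_p(observed, permuted)
    excesses = permuted[permuted > threshold] - threshold
    # Fit shape and scale to the excesses, with location fixed at zero.
    shape, _, scale = stats.genpareto.fit(excesses, floc=0)
    tail_prob = stats.genpareto.sf(observed - threshold, shape, loc=0, scale=scale)
    # P(T >= observed) = P(T > threshold) * P(excess >= observed - threshold)
    return (excesses.size / permuted.size) * tail_prob


rng = np.random.default_rng(0)
x = rng.normal(size=40)
y = 0.5 * x + rng.normal(size=40)  # synthetic data with a true effect
obs, perm = permutation_stats(x, y, n_perm=500, rng=rng)
print("empirical p:", empirical_p(obs, perm))
print("GPD tail p: ", gpd_tail_p(obs, perm))
```

With only a few hundred permutations the empirical p-value cannot fall below 1/(number of permutations + 1), whereas the fitted tail can extrapolate beyond that floor. Because a correlation is bounded, the fitted tail may have a finite upper end, in which case the estimate can be exactly zero for a very large observed statistic.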