Inria, CEA, Université Paris-Saclay, Paris, France; Institut de Mathématiques de Toulouse, UMR 5219, Université de Toulouse, CNRS, UPS, Toulouse, France.
Inria, CEA, Université Paris-Saclay, Paris, France.
Neuroimage. 2022 Oct 15;260:119492. doi: 10.1016/j.neuroimage.2022.119492. Epub 2022 Jul 20.
Cluster-level inference procedures are widely used for brain mapping. These methods compare the size of clusters obtained by thresholding brain maps to an upper bound under the global null hypothesis, computed using Random Field Theory or permutations. However, the guarantee obtained by this type of inference (namely, that at least one voxel in the cluster is truly activated) is not informative with regard to the strength of the signal therein. There is thus a need for methods that assess the amount of signal within clusters; yet such methods must take into account that clusters are defined based on the data, which creates circularity in the inference scheme. This has motivated the use of post hoc estimates that allow statistically valid estimation of the proportion of activated voxels in clusters. In the context of fMRI data, the All-Resolutions Inference framework introduced in Rosenblatt et al. (2018) provides post hoc estimates of the proportion of activated voxels. However, this method relies on parametric threshold families, which results in conservative inference. In this paper, we leverage randomization methods to adapt to data characteristics and obtain tighter false discovery control. The resulting method, Notip (Non-parametric True Discovery Proportion control), is a powerful, non-parametric method that yields statistically valid guarantees on the proportion of activated voxels in data-derived clusters. Numerical experiments demonstrate substantial gains in number of detections compared with state-of-the-art methods on 36 fMRI datasets. The conditions under which the proposed method brings benefits are also discussed.
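To make the notion of a post hoc bound based on a threshold family concrete, the following is a minimal sketch and not the authors' implementation. It assumes the Simes family t_k = alpha * k / m as the example of a parametric family, and it uses the standard interpolation bound from the post hoc inference literature: the number of false discoveries in a set S is bounded by the minimum over k of (#{i in S : p_i > t_k} + k - 1), capped at |S|. The function names `simes_thresholds` and `tdp_lower_bound` are hypothetical.

```python
import numpy as np


def simes_thresholds(m, alpha=0.05):
    """Simes threshold family t_k = alpha * k / m for k = 1..m (assumed example)."""
    return alpha * np.arange(1, m + 1) / m


def tdp_lower_bound(p_cluster, thresholds):
    """Lower bound on the proportion of truly activated voxels in a cluster.

    p_cluster  : p-values of the voxels in the (data-derived) cluster
    thresholds : non-decreasing threshold family t_1 <= t_2 <= ...
    """
    p_sorted = np.sort(np.asarray(p_cluster, dtype=float))
    s = p_sorted.size
    if s == 0:
        return 0.0
    # Thresholds beyond the s-th cannot improve the bound, since k - 1 >= s there.
    t = np.asarray(thresholds, dtype=float)[:s]
    k = np.arange(1, t.size + 1)
    # Number of cluster p-values strictly above each threshold t_k.
    n_above = s - np.searchsorted(p_sorted, t, side="right")
    # Upper bound on false discoveries in the cluster, capped at its size.
    v_bound = min(s, int(np.min(n_above + k - 1)))
    return 1.0 - v_bound / s


# Toy usage: 1000 voxels in total, a cluster of 10 voxels of which 5 are very small.
thresholds = simes_thresholds(m=1000, alpha=0.05)
cluster_p = np.array([1e-8] * 5 + [0.5] * 5)
print(tdp_lower_bound(cluster_p, thresholds))
```

The bound accepts any non-decreasing threshold family, so a family calibrated on the data by randomization, as the abstract describes, could in principle be passed in place of the Simes family; how Notip performs that calibration is not shown here.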