Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.
Neuroimage. 2011 May 15;56(2):497-507. doi: 10.1016/j.neuroimage.2010.07.074. Epub 2010 Aug 13.
The relationship between spatially distributed fMRI patterns and experimental stimuli or tasks offers insights into cognitive processes beyond those traceable from individual local activations. The multivariate properties of fMRI signals allow us to infer interactions among individual regions and to detect distributed activations of multiple areas. Detecting task-specific multivariate activity in fMRI data is an important open problem that has recently drawn much interest. In this paper, we study and demonstrate the benefits of random forest classifiers and the associated Gini importance measure for selecting voxel subsets that form a multivariate neural response. The Gini importance measure quantifies the predictive power of a particular feature when considered as part of the entire pattern. The measure is based on a random sampling of fMRI time points and voxels. As a consequence, the resulting voxel score, or Gini contrast, is highly reproducible and reliably includes all informative features. The method does not rely on a priori assumptions about the signal distribution, on a specific statistical or functional model, or on regularization. Instead, it uses the predictive power of features to characterize their relevance for encoding task information. The Gini contrast offers the additional advantage of directly quantifying the task-relevant information in a multiclass setting, rather than reducing the problem to several binary classification subproblems. In a multicategory visual fMRI study, the proposed method identified informative regions not detected by univariate criteria such as the t-test or the F-test. Including these additional regions in the feature set improves the accuracy of multicategory classification. Moreover, we demonstrate higher classification accuracy and greater stability of the detected spatial patterns across runs than traditional methods such as recursive feature elimination used in conjunction with support vector machines.
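The voxel-scoring idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's RandomForestClassifier, whose feature_importances_ attribute reports the normalized mean decrease in Gini impurity, and it runs on synthetic data in place of fMRI volumes. The function names gini_contrast and make_synthetic_data, the array sizes, and the forest settings are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def gini_contrast(X, y, n_trees=200, seed=0):
    """Return one Gini importance score per voxel (column of X).

    X: array of shape (n_time_points, n_voxels)
    y: integer class label (stimulus category) for each time point
    """
    forest = RandomForestClassifier(
        n_estimators=n_trees,
        criterion="gini",
        max_features="sqrt",  # random subset of voxels tried at each split
        bootstrap=True,       # each tree sees a random resample of time points
        random_state=seed,
    )
    forest.fit(X, y)
    # Mean decrease in Gini impurity, normalized to sum to 1 across voxels.
    return forest.feature_importances_


def make_synthetic_data(n_samples=300, n_voxels=50, n_informative=5,
                        n_classes=3, seed=0):
    """Toy stand-in for fMRI data: only the first voxels carry class signal."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, n_classes, size=n_samples)
    X = rng.normal(size=(n_samples, n_voxels))
    # Shift the informative voxels by an amount that depends on the class.
    X[:, :n_informative] += 2.0 * y[:, None]
    return X, y


X, y = make_synthetic_data()
scores = gini_contrast(X, y)

# Rank voxels by score; the highest-scoring ones form the selected subset.
ranking = np.argsort(scores)[::-1]
print("top voxels:", ranking[:5])
```

Because the forest is trained directly on all class labels, the scores reflect multiclass information without splitting the task into binary subproblems. In this toy setting the informative voxels should receive clearly higher scores than the noise voxels; real fMRI data would additionally call for cross-validation across runs before any voxel subset is interpreted.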