Stehr Daniel A, Garcia Javier O, Pyles John A, Grossman Emily D
University of California, Irvine, United States of America.
US DEVCOM Army Research Laboratory, United States of America.
J Neurosci Methods. 2023 Mar 1;387:109808. doi: 10.1016/j.jneumeth.2023.109808. Epub 2023 Feb 2.
Multivariate pattern analysis (MVPA or pattern decoding) has attracted considerable attention as a sensitive analytic tool for investigations using functional magnetic resonance imaging (fMRI) data. With the introduction of MVPA, however, has come a proliferation of methodological choices confronting the researcher, with few studies to date offering guidance from the vantage point of controlled datasets detached from specific experimental hypotheses.
We investigated how four data processing steps, each intended to maximize information capture in the presence of common noise sources, affect support vector machine (SVM) classification performance. The four steps were trial averaging (classifying on separate trial estimates versus condition-based averages), within-run mean centering (centering the data or not), method of cost selection (using a fixed or tuned cost value), and motion-related denoising approach (comparing no denoising versus a variety of nuisance regressions capturing motion-related reference signals). The impact of these approaches was evaluated on real fMRI data from two control ROIs, as well as on simulated pattern data constructed with carefully controlled voxel- and trial-level noise components.
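As a rough illustration of two of these steps, the sketch below applies within-run mean centering and run-wise condition averaging to a trials-by-voxels matrix. It is not the authors' code. The function names, the array layout, and the choice to center each voxel across the trials of a run are assumptions made for this example; the abstract does not specify the exact centering operation.

```python
import numpy as np

def center_within_run(X, runs):
    """Subtract, for every run, each voxel's mean over that run's trials.

    Assumption: X has shape (n_trials, n_voxels) and centering is done
    per voxel across trials; the abstract does not state this detail.
    """
    X = np.asarray(X, dtype=float)
    runs = np.asarray(runs)
    centered = X.copy()
    for r in np.unique(runs):
        mask = runs == r
        centered[mask] -= X[mask].mean(axis=0)
    return centered

def average_by_run_and_condition(X, runs, labels):
    """Collapse trials into one averaged pattern per (run, condition)."""
    X = np.asarray(X, dtype=float)
    runs = np.asarray(runs)
    labels = np.asarray(labels)
    patterns, out_runs, out_labels = [], [], []
    for r in np.unique(runs):
        for c in np.unique(labels):
            mask = (runs == r) & (labels == c)
            if mask.any():
                patterns.append(X[mask].mean(axis=0))
                out_runs.append(r)
                out_labels.append(c)
    return np.vstack(patterns), np.array(out_runs), np.array(out_labels)
```

The returned run labels can serve as grouping variables for leave-one-run-out cross-validation, so that averaged patterns from the same run never appear in both the training and the test set.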
We find significant improvements in classification performance across both real and simulated datasets with run-wise trial averaging and mean centering. When trials are averaged within the conditions of each run, however, the between-subject variability of SVM classification accuracies also increases, which we attribute to the reduced size of the test set used to assess the classifier's prediction error. We therefore propose a hybrid technique in which randomly sampled subsets of trials are averaged per run, and we demonstrate that it helps mitigate the tradeoff between improving signal-to-noise ratio by averaging and losing exemplars in the test set.
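One way to picture the hybrid technique is sketched below: within each run and condition, trials are shuffled and averaged in small groups, so each run contributes several partially averaged exemplars rather than a single one. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name, the `subset_size` and `seed` parameters, the use of disjoint subsets drawn without replacement, and the discarding of leftover trials are all choices made for this example.

```python
import numpy as np

def average_random_subsets(X, runs, labels, subset_size, seed=0):
    """Average disjoint random subsets of trials per (run, condition).

    Assumptions (not from the source): X has shape (n_trials, n_voxels),
    subsets are disjoint, and trials that do not fill a complete subset
    are discarded.
    """
    if subset_size < 1:
        raise ValueError("subset_size must be at least 1")
    X = np.asarray(X, dtype=float)
    runs = np.asarray(runs)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    patterns, out_runs, out_labels = [], [], []
    for r in np.unique(runs):
        for c in np.unique(labels):
            idx = np.flatnonzero((runs == r) & (labels == c))
            rng.shuffle(idx)  # random assignment of trials to subsets
            last_start = len(idx) - subset_size
            for start in range(0, last_start + 1, subset_size):
                chunk = idx[start:start + subset_size]
                patterns.append(X[chunk].mean(axis=0))
                out_runs.append(r)
                out_labels.append(c)
    return np.vstack(patterns), np.array(out_runs), np.array(out_labels)
```

With `subset_size=1` this reduces to single-trial classification, and with `subset_size` equal to the number of trials per condition in a run it reduces to full run-wise averaging, so the parameter spans the tradeoff described above.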
Though a handful of empirical studies have employed run-based trial averaging, mean centering, or their combination, such studies have done so without theoretical justification or rigorous testing using control ROIs.
Therefore, we intend this study to serve as a practical guide for researchers wishing to optimize pattern decoding without risk of introducing spurious results.