Mumford Jeanette A
Center for Healthy Minds, University of Wisconsin, Madison, United States.
Neuroimage. 2017 Feb 15;147:658-668. doi: 10.1016/j.neuroimage.2016.12.058. Epub 2016 Dec 25.
Even after thorough preprocessing and careful time series analysis of functional magnetic resonance imaging (fMRI) data, artifacts and other issues can violate the group-level model's assumption that the variance is constant across subjects. This is especially concerning when modeling a continuous covariate at the group level, because the slope is easily biased by outliers. Various models have been proposed to deal with outliers, including models that use the first-level variance or the group-level residual magnitude to differentially weight subjects. The most commonly used form of robust regression, which implements a robust estimator of the regression slope, has previously been studied in the context of fMRI and was found to perform well in some scenarios, although Type I error control can be lost for some outlier settings. A second type of robust regression, which uses a heteroscedasticity and autocorrelation consistent (HAC) estimator to produce robust slope and variance estimates, has been shown to perform well, with better Type I error control, but this was shown with large sample sizes (500-1000 subjects). Its Type I error control with smaller sample sizes has not been studied, nor has the model been compared to other approaches that handle outliers, such as FSL's FLAME 1 and FSL's outlier de-weighting. Focusing on group-level inference with a continuous covariate over a range of sample sizes and degrees of heteroscedasticity, which can be driven by either within- or between-subject variability, this study compares both styles of robust regression to ordinary least squares (OLS), FSL's FLAME 1, FLAME 1 with the outlier de-weighting algorithm, and Kendall's Tau. Additionally, subject omission using Cook's distance with OLS and nonparametric inference with the OLS statistic are studied. The pros and cons of these models are discussed, along with general strategies for detecting outliers in data and taking precautions to avoid inflated Type I error rates.
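As a purely illustrative aid, the sketch below shows how a single outlying subject can bias an OLS slope and how Cook's distance can flag that subject for omission, one of the strategies named in the abstract. The data are made up, the helper names (`ols_fit`, `cooks_distance`) are hypothetical, and the 4/n cutoff is a common rule of thumb assumed here; none of these come from the study itself, and the sketch says nothing about the Type I error behavior the study evaluates.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) of a simple linear regression of y on x."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def cooks_distance(x, y):
    """Cook's distance per observation for a model with intercept and slope."""
    n, p = len(x), 2                      # p = number of fitted parameters
    b0, b1 = ols_fit(x, y)
    xbar = mean(x)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)          # residual variance
    lever = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]
    return [(e * e / (p * s2)) * h / (1 - h) ** 2 for e, h in zip(resid, lever)]


# Hypothetical group-level data: one covariate value and one effect per subject.
# The true slope is roughly zero; the last subject is an artificial outlier.
covariate = [float(i) for i in range(10)]
effect = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, -0.1, 0.0, 0.2, 5.0]

_, slope_all = ols_fit(covariate, effect)

# Assumed cutoff: flag subjects whose Cook's distance exceeds 4 / n.
dist = cooks_distance(covariate, effect)
cutoff = 4 / len(covariate)
flagged = [i for i, d in enumerate(dist) if d > cutoff]

kept = [i for i in range(len(covariate)) if i not in flagged]
_, slope_trimmed = ols_fit([covariate[i] for i in kept], [effect[i] for i in kept])

print("slope with all subjects:", round(slope_all, 3))
print("flagged subjects:", flagged)
print("slope after omission:", round(slope_trimmed, 3))
```

With these toy values the single high-leverage outlier is the only subject flagged, and the slope moves back toward zero once that subject is omitted. Whether omission of this kind preserves Type I error control is exactly the sort of question the study addresses, so the sketch should not be read as a recommendation.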