Nielson Dylan M, Sederberg Per B
Data Science and Sharing Team, National Institute of Mental Health, Bethesda, MD, United States of America.
Department of Psychology, The Ohio State University, Columbus, OH, United States of America.
PLoS One. 2017 Aug 22;12(8):e0182797. doi: 10.1371/journal.pone.0182797. eCollection 2017.
Mixed effects models provide significant advantages in sensitivity and flexibility over typical statistical approaches to neural data analysis, but mass univariate application of mixed effects models to large neural datasets is computationally intensive. Threshold-free cluster enhancement also provides a significant increase in sensitivity, but requires computationally intensive permutation-based significance testing. Not surprisingly, the combination of mixed effects models with threshold-free cluster enhancement and nonparametric permutation-based significance testing is currently completely impractical. With mixed effects for large datasets (MELD), we circumvent this impasse by using a singular value decomposition to reduce the dimensionality of neural data while maximizing signal. A singular value decomposition becomes unstable when there are large numbers of noise features, so we precede it with a bootstrap-based feature selection step that employs threshold-free cluster enhancement to identify features that are stable across subjects. By projecting the dependent data into the reduced space of the singular value decomposition, we gain the power of a multivariate approach and greatly reduce the number of mixed effects models that need to be run, making it feasible to use permutation testing to determine feature-level significance. Due to these innovations, MELD is much faster than an element-wise mixed effects analysis, and on simulated data MELD was more sensitive than standard techniques, such as element-wise t-tests combined with threshold-free cluster enhancement. When evaluated on an EEG dataset, MELD identified more significant features than the t-tests with threshold-free cluster enhancement in a comparable amount of time.
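The core idea of the abstract (reduce the data with a singular value decomposition, fit a small number of models in the reduced space, then use permutation testing for feature-level significance) can be illustrated with a minimal NumPy sketch. This is not the MELD implementation. It makes several simplifying assumptions: an ordinary least-squares slope stands in for each mixed effects model, the bootstrap feature selection and threshold-free cluster enhancement steps are omitted, and family-wise error is controlled with a plain max-statistic permutation test. The function name `svd_permutation_sketch` and its parameters are hypothetical.

```python
import numpy as np


def svd_permutation_sketch(X, y, n_components=3, n_perms=200, seed=0):
    """Illustrative only: X is (n_obs, n_features), y is (n_obs,).

    Returns one statistic and one permutation p-value per feature.
    """
    rng = np.random.default_rng(seed)

    # Center the data and the predictor.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()

    # Thin SVD: Xc = U @ diag(s) @ Vt.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, len(s))

    # Data projected onto the first k components (equals Xc @ Vt[:k].T).
    scores = U[:, :k] * s[:k]

    def feature_stat(pred):
        # Fit k simple slopes instead of one model per feature
        # (assumption: OLS slope as a stand-in for a mixed effects fit).
        betas = scores.T @ pred / (pred @ pred)
        # Back-project the k slopes to feature space.
        return betas @ Vt[:k]

    observed = feature_stat(yc)

    # Null distribution of the maximum absolute statistic over features,
    # built by shuffling the predictor across observations.
    null_max = np.empty(n_perms)
    for i in range(n_perms):
        null_max[i] = np.abs(feature_stat(rng.permutation(yc))).max()

    # Max-statistic corrected p-value for each feature.
    exceed = (null_max[None, :] >= np.abs(observed)[:, None]).sum(axis=1)
    p_values = (1 + exceed) / (n_perms + 1)
    return observed, p_values
```

In this sketch each permutation requires only `n_components` fits rather than one fit per feature, which is the source of the speedup the abstract describes. A real analysis would replace the slope with a mixed effects model that accounts for the grouping of observations within subjects.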