He Kevin, Xu Han, Kang Jian
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan.
Department of Statistics, University of Michigan, Ann Arbor, Michigan.
Wiley Interdiscip Rev Comput Stat. 2019 Mar-Apr;11(2). doi: 10.1002/wics.1454. Epub 2018 Sep 21.
In neuroimaging studies, regression models are frequently used to identify associations between imaging features and clinical outcomes, where the number of imaging features (e.g., hundreds of thousands of voxel-level predictors) far exceeds the number of subjects. Classical best subset selection and penalized variable selection methods, which perform well for low- or moderate-dimensional data, do not scale to ultrahigh-dimensional neuroimaging data. To reduce the dimensionality, variable screening has emerged as a powerful tool for feature selection in neuroimaging studies. We present a selective review of recent developments in ultrahigh-dimensional variable screening, with a focus on their practical performance in the analysis of neuroimaging data with complex spatial correlation structures and high dimensionality. We conduct extensive simulation studies to compare the selection accuracy and computational cost of the different methods. We also present analyses of resting-state functional magnetic resonance imaging data from the Autism Brain Imaging Data Exchange study. This article is categorized under: Applications of Computational Statistics > Computational and Molecular Biology; Statistical Learning and Exploratory Methods of the Data Sciences > Image Data Mining; Statistical and Graphical Methods of Data Analysis > Analysis of High Dimensional Data.