Guo Ying
Department of Biostatistics and Bioinformatics, Rollins School of Public Health of Emory University, 1518 Clifton RD NE, Atlanta, GA, 30322, USA
Stat Interface. 2010 Jan 1;3(1):103-112. doi: 10.4310/sii.2010.v3.n1.a9.
Brain imaging data have shown great promise as predictors of psychiatric conditions, cognitive functions, and many other neural-related outcomes. Developing prediction models from imaging data is challenging because of the high dimensionality of the data, noisy measurements, complex correlation structures among voxels, small sample sizes, and between-subject heterogeneity. Most existing prediction approaches apply a dimension reduction method such as principal component analysis (PCA) to whole-brain images as a preprocessing step. These approaches usually do not take into account the cluster structure among voxels or between-subject differences. We propose a weighted cluster kernel PCA predictive model that addresses these challenges. We first divide voxels into clusters based on neuroanatomic parcellation or data-driven methods, then extract cluster-specific principal features using kernel PCA, and define the prediction model based on these principal features. Finally, we propose a weighted estimation method for the prediction model in which each subject is weighted according to the percentage of variance explained by the principal features. The proposed method allows assessment of the relative importance of various brain regions in prediction, captures nonlinearity in the feature space, and helps guard against overfitting to outlying subjects when building the predictive model. We evaluate the performance of our method through simulation studies and illustrate it with a real fMRI data example.
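The three steps described above (cluster the voxels, extract cluster-specific kernel PCA features, fit a subject-weighted prediction model) can be sketched roughly as follows. This is a minimal illustration under several assumptions that the abstract does not specify: an RBF kernel with a hypothetical default bandwidth, a linear prediction model fitted by weighted least squares, a per-subject "variance explained" defined as the share of a subject's centered kernel-space squared norm captured by the retained components, and subject weights obtained by averaging that share across clusters. The function names are hypothetical, and the estimator in the paper may differ in each of these choices.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_pca_scores(X, n_components, gamma):
    """Kernel PCA scores for one voxel cluster, plus each subject's
    fraction of feature-space variance captured by those scores
    (an assumed definition, not taken from the paper)."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    Kc = H @ K @ H                               # centered kernel
    evals, evecs = np.linalg.eigh(Kc)            # ascending eigenvalues
    top = np.argsort(evals)[::-1][:n_components]
    lam = np.maximum(evals[top], 1e-12)
    scores = evecs[:, top] * np.sqrt(lam)        # n x n_components
    total = np.maximum(np.diag(Kc), 1e-12)       # squared norm per subject
    explained = np.clip(np.sum(scores ** 2, axis=1) / total, 0.0, 1.0)
    return scores, explained


def fit_weighted_cluster_kpca(X, y, clusters, n_components=2, gamma=None):
    """X: subjects x voxels, y: outcome, clusters: list of column-index lists.
    Returns (beta, Z, w): coefficients, design matrix, subject weights."""
    blocks, fractions = [], []
    for cols in clusters:
        Xc = X[:, cols]
        # Assumed default bandwidth: 1 / (number of voxels in the cluster).
        g = gamma if gamma is not None else 1.0 / Xc.shape[1]
        s, e = kernel_pca_scores(Xc, n_components, g)
        blocks.append(s)
        fractions.append(e)
    Z = np.column_stack([np.ones(len(y))] + blocks)   # intercept + features
    w = np.mean(fractions, axis=0)                    # assumed weighting rule
    sw = np.sqrt(w)
    # Weighted least squares via row scaling by sqrt(w).
    beta, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return beta, Z, w
```

In this sketch, subjects whose cluster-level data are poorly represented by the retained components receive smaller weights, which is one way to limit the influence of outlying subjects on the fit. Because the coefficients are grouped by cluster, comparing the blocks of `beta` gives a rough view of how much each brain region contributes to the prediction.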