Zhao Yi, Caffo Brian, Luo Xi
Department of Biostatistics and Health Data Science, Indiana University School of Medicine.
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health.
Electron J Stat. 2021;15(2):4192-4235. doi: 10.1214/21-ejs1887. Epub 2021 Sep 14.
This manuscript presents an approach for generalized linear regression in which the outcomes are multiple high-dimensional covariance matrices. In many areas of study, such as resting-state functional magnetic resonance imaging (fMRI), this type of regression can be used to characterize how covariance matrices vary across units. Model parameters are estimated by maximizing the likelihood of a generalized linear model, conditional on a well-conditioned linear shrinkage estimator of the covariance matrices in which the shrinkage coefficients are shared across matrices. Theoretical results show that the proposed covariance matrix estimator is optimal, asymptotically achieving the uniformly minimum quadratic loss among all linear combinations of the identity matrix and the sample covariance matrix. Under certain regularity conditions, the proposed estimator of the model parameters is consistent. Simulation studies illustrate that the proposed approach outperforms existing methods. Applied to resting-state fMRI data from the Alzheimer's Disease Neuroimaging Initiative, the proposed approach identified a brain network within which functional connectivity is significantly associated with Apolipoprotein E ε4, a strong genetic marker for Alzheimer's disease.
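To make the shrinkage idea in the abstract concrete, the following is a minimal sketch of linear shrinkage toward a scaled identity matrix, with one shrinkage weight shared across all units. It is an illustration under stated assumptions, not the authors' estimator: the function name `shared_linear_shrinkage`, the choice of target `mu * I` with `mu = trace(S) / p`, and the fixed weight `rho` supplied by the caller are all hypothetical. The paper derives its shrinkage coefficients from an optimality criterion, which this sketch does not attempt to reproduce.

```python
import numpy as np


def shared_linear_shrinkage(samples, rho):
    """Shrink each unit's sample covariance toward a scaled identity.

    samples : list of (n_i, p) arrays, one per unit (rows are observations).
    rho     : shrinkage weight in [0, 1], shared by all units. Chosen by
              the caller here (an assumption); it is not estimated.
    Returns a list of (p, p) matrices rho * mu_i * I + (1 - rho) * S_i.
    """
    estimates = []
    for x in samples:
        centered = x - x.mean(axis=0)
        s = centered.T @ centered / x.shape[0]   # sample covariance S_i
        p = s.shape[0]
        mu = np.trace(s) / p                     # average eigenvalue of S_i
        estimates.append(rho * mu * np.eye(p) + (1.0 - rho) * s)
    return estimates


# Three simulated units with fewer observations (20) than variables (50),
# so each sample covariance is singular before shrinkage.
rng = np.random.default_rng(0)
samples = [rng.standard_normal((20, 50)) for _ in range(3)]
shrunk = shared_linear_shrinkage(samples, rho=0.5)
```

With fewer observations than variables, each sample covariance matrix is rank-deficient. Any `rho > 0` lifts the smallest eigenvalue to at least `rho * mu`, which is what makes the shrunken estimate well-conditioned, while the trace of each matrix is left unchanged.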