School of Automation, Northwestern Polytechnical University, Xi'an 710072, China.
Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA.
Med Image Anal. 2020 Apr;61:101656. doi: 10.1016/j.media.2020.101656. Epub 2020 Jan 23.
Brain imaging genetics has become an important research topic because it can reveal complex associations between genetic factors and the structures or functions of the human brain. Sparse canonical correlation analysis (SCCA) is a popular bi-multivariate association identification method. To mine the complex genetic basis of brain imaging phenotypes, many SCCA methods have emerged that use a variety of norms to incorporate different structures of interest. They typically use the group lasso penalty, the fused lasso penalty, or the graph/network guided fused lasso penalty. However, the group lasso methods have limited capability because prior knowledge is often incomplete or unavailable in real applications. The fused lasso and graph/network guided methods are sensitive to the sign of the sample correlation, which may be incorrectly estimated. In this paper, we introduce two new penalties that improve on the fused lasso and the graph/network guided lasso penalties in structured sparse learning. We impose both penalties on the SCCA model and propose an optimization algorithm to solve it. The proposed SCCA method has a strong upper bound on the grouping effect for both positively and negatively highly correlated variables. We show that, on both synthetic and real neuroimaging genetics data, the proposed SCCA method performs better than or comparably to conventional methods that use the fused lasso or the graph/network guided fused lasso. In particular, the proposed method identifies higher canonical correlation coefficients and captures clearer canonical weight patterns, demonstrating its promising capability in revealing biologically meaningful imaging genetic associations.
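The sign sensitivity described in the abstract can be illustrated with a small numerical sketch. It assumes the common graph guided fused lasso form, the sum over edges of |u_i - s_ij u_j| with s_ij the sign of the sample correlation, and compares it with a pairwise L2 penalty that needs no sign. The abstract does not state the form of the two proposed penalties, so the sign-free penalty below is only an illustrative assumption and not the paper's definition; the function names, weights, and edges are likewise hypothetical.

```python
import numpy as np

def graph_guided_fused_lasso(u, edges, signs):
    # Sum over edges of |u_i - s_ij * u_j|, where s_ij is the (estimated)
    # sign of the sample correlation between variables i and j.
    return float(sum(abs(u[i] - s * u[j]) for (i, j), s in zip(edges, signs)))

def pairwise_l2(u, edges):
    # Hypothetical sign-free alternative: sum over edges of sqrt(u_i^2 + u_j^2).
    # Shown only for contrast; it is an assumption, not the paper's penalty.
    return float(sum(np.hypot(u[i], u[j]) for i, j in edges))

# Toy weights: variables 0 and 1 move together, variable 2 is opposite to 1.
u = np.array([0.5, 0.5, -0.5])
edges = [(0, 1), (1, 2)]

correct_signs = [1, -1]  # signs that match the true correlation pattern
wrong_signs = [1, 1]     # the sign on edge (1, 2) is mis-estimated

p_correct = graph_guided_fused_lasso(u, edges, correct_signs)
p_wrong = graph_guided_fused_lasso(u, edges, wrong_signs)
p_free = pairwise_l2(u, edges)

print(p_correct, p_wrong, p_free)
```

With the correct signs the fused penalty leaves the true pattern unpenalized, while one wrong sign makes it penalize that same pattern. The sign-free penalty depends only on the magnitudes of the paired weights, so it takes the same value whichever signs were estimated.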