Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, 3700 Hamilton Walk, Philadelphia, PA, 19104, USA.
Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu, 215000, China.
Methods. 2023 Oct;218:27-38. doi: 10.1016/j.ymeth.2023.07.007. Epub 2023 Jul 27.
Investigating the relationship between genetic variation and phenotypic traits is a key issue in quantitative genetics. For Alzheimer's disease in particular, the association between genetic markers and quantitative traits remains unclear; once identified, it will provide valuable guidance for the study and development of genetics-based treatment approaches. To analyze the association between two modalities, sparse canonical correlation analysis (SCCA) is commonly used. It computes one sparse linear combination of the features for each modality, yielding a pair of weight vectors that maximizes the cross-correlation between the analyzed modalities. One drawback of the plain SCCA model is that existing findings and knowledge cannot be integrated into the model as priors to help extract interesting correlations and identify biologically meaningful genetic and phenotypic markers. To bridge this gap, we introduce preference matrix guided SCCA (PM-SCCA), which incorporates priors encoded as a preference matrix while maintaining computational simplicity. A simulation study and a real-data experiment are conducted to investigate the effectiveness of the model. Both experiments demonstrate that the proposed PM-SCCA model can effectively capture both the genotype-phenotype correlation and the relevant features.