Gao H, Wu Y, Zhang T, Wu Y, Jiang L, Zhan J, Li J, Yang R
Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, People's Republic of China.
Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN, USA.
Heredity (Edinb). 2014 Dec;113(6):526-32. doi: 10.1038/hdy.2014.57. Epub 2014 Jul 2.
Given the drawbacks of implementing multivariate analysis for mapping multiple traits in genome-wide association studies (GWAS), principal component analysis (PCA) has been widely used to generate independent 'super traits' from the original multivariate phenotypic traits, each of which is then analyzed univariately. However, parameter estimates in this framework may not be the same as those from the joint analysis of all traits, leading to spurious linkage results. In this paper, we propose to perform the PCA on the residual covariance matrix instead of the phenotypic covariance matrix, and to use it to transform the multiple traits into a group of pseudo principal components. The PCA on the residual covariance matrix allows each pseudo principal component to be analyzed separately. In addition, all parameter estimates are equivalent, under a linear transformation, to those obtained from the joint multivariate analysis. Moreover, a fast least absolute shrinkage and selection operator (LASSO) for estimating the sparse oversaturated genetic model greatly reduces the computational cost of this procedure. Extensive simulations show the statistical and computational efficiency of the proposed method. We illustrate this method in a GWAS for 20 slaughtering traits and meat quality traits in beef cattle.
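The central idea of the abstract, rotating the traits by the eigenvectors of the residual covariance matrix and then analyzing each rotated trait on its own, can be illustrated with a small numerical sketch. This is not the authors' implementation: it assumes ordinary least squares with a single simulated marker in place of the paper's LASSO on an oversaturated genetic model, and the function name `pseudo_pc_transform`, the simulated data, and all parameter values are hypothetical. Under those assumptions it shows that the pseudo principal components have uncorrelated residuals and that the per-component estimates map back to the joint estimates through a linear transformation.

```python
import numpy as np


def pseudo_pc_transform(Y, X):
    """Rotate traits Y (n x t) by the eigenvectors of the residual
    covariance matrix obtained from regressing Y on X (n x p).

    Hypothetical helper for illustration; plain OLS is assumed here.
    """
    # Joint multivariate least-squares fit: B has shape (p, t).
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    E = Y - X @ B                          # residuals, n x t
    dof = X.shape[0] - X.shape[1]
    R = E.T @ E / dof                      # residual covariance, t x t
    eigvals, U = np.linalg.eigh(R)         # R = U diag(eigvals) U^T
    return Y @ U, U, eigvals, B            # pseudo principal components first


# Simulated data (assumption): one marker coded 0/1/2 and four correlated traits.
rng = np.random.default_rng(0)
n, t = 500, 4
snp = rng.integers(0, 3, size=n).astype(float)
X = np.column_stack([np.ones(n), snp])     # intercept + marker
B_true = np.array([[1.0, 0.5, -0.2, 0.0],
                   [0.3, 0.0, 0.4, -0.1]])
mix = rng.normal(size=(t, t))
Y = X @ B_true + rng.normal(size=(n, t)) @ mix.T   # correlated residuals

Z, U, eigvals, B_joint = pseudo_pc_transform(Y, X)

# Analyze each pseudo principal component separately (univariate fits).
B_pc = np.column_stack(
    [np.linalg.lstsq(X, Z[:, k], rcond=None)[0] for k in range(t)]
)

# Residual covariance of the pseudo principal components is diagonal.
E_pc = Z - X @ B_pc
R_pc = E_pc.T @ E_pc / (n - X.shape[1])

# Back-transform the separate estimates to the original trait scale.
B_back = B_pc @ U.T

print("max |B_back - B_joint|:", np.abs(B_back - B_joint).max())
print("max off-diagonal of R_pc:", np.abs(R_pc - np.diag(np.diag(R_pc))).max())
```

In this simplified setting the agreement between `B_back` and `B_joint` is exact up to floating-point error, because least-squares estimates commute with any linear transformation of the traits. What the rotation by the residual eigenvectors adds is that the transformed residuals are uncorrelated, which is what justifies treating the components one at a time.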