University of Liège (Belgium).
Max Planck Institute of Psychiatry (Germany).
Brief Bioinform. 2019 Nov 27;20(6):2200-2216. doi: 10.1093/bib/bby081.
Principal components (PCs) are widely used in statistics. They are a relatively small number of uncorrelated variables, derived from an initial pool of variables, that together explain as much of the total variance as possible. Principal component analysis (PCA) is also a popular technique in statistical genetics. Achieving optimal results requires a thorough understanding of the different implementations of PCA and of their impact on study results, compared with alternative approaches. In this review, we focus on the possibilities, limitations and role of PCs in ancestry prediction, genome-wide association studies, rare variant analyses, imputation strategies, meta-analysis and epistasis detection. We also describe several variations of classic PCA that deserve increased attention in statistical genetics applications.
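As a minimal illustration of the definition of PCs given in the abstract, the following Python sketch computes them from a centered data matrix by singular value decomposition. It is not taken from the review: the function name `pca`, the toy genotype matrix `G` (200 simulated individuals, 50 variants coded 0/1/2, two groups with different allele frequencies) and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def pca(X, n_components):
    """Return PC scores and explained-variance ratios for the rows of X.

    Illustrative helper (hypothetical name): centers each column,
    then uses the SVD of the centered matrix.
    """
    Xc = X - X.mean(axis=0)                      # center each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # projections onto the PCs
    explained = (S ** 2) / np.sum(S ** 2)        # share of total variance per PC
    return scores, explained[:n_components]

# Hypothetical toy data: two groups of 100 individuals whose allele
# frequencies differ slightly at each of 50 variants.
rng = np.random.default_rng(0)
freqs_a = rng.uniform(0.1, 0.9, size=50)
freqs_b = np.clip(freqs_a + rng.normal(0, 0.2, size=50), 0.05, 0.95)
G = np.vstack([
    rng.binomial(2, freqs_a, size=(100, 50)),
    rng.binomial(2, freqs_b, size=(100, 50)),
]).astype(float)

scores, ratio = pca(G, n_components=3)

# The PC scores are mutually uncorrelated: their covariance matrix is diagonal.
cov = np.cov(scores, rowvar=False)
off_diagonal = cov - np.diag(np.diag(cov))
```

In this sketch the off-diagonal entries of `cov` vanish up to floating-point error, and the entries of `ratio` are non-increasing, which reflects the two properties named in the definition: uncorrelated components, ordered by the variance they explain. Implementations used in practice for genetic data typically add steps not shown here, such as variant scaling.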