Wang Chaolong, Szpiech Zachary A, Degnan James H, Jakobsson Mattias, Pemberton Trevor J, Hardy John A, Singleton Andrew B, Rosenberg Noah A
University of Michigan, USA.
Stat Appl Genet Mol Biol. 2010;9(1):Article 13. doi: 10.2202/1544-6115.1493. Epub 2010 Jan 27.
Recent applications of principal components analysis (PCA) and multidimensional scaling (MDS) in human population genetics have found that "statistical maps" based on the genotypes in population-genetic samples often resemble geographic maps of the underlying sampling locations. To provide formal tests of these qualitative observations, we describe a Procrustes analysis approach for quantitatively assessing the similarity of population-genetic and geographic maps. We confirm in two scenarios, one using single-nucleotide polymorphism (SNP) data from Europe and one using SNP data worldwide, that a measurably high level of concordance exists between statistical maps of population-genetic variation and geographic maps of sampling locations. Two other examples illustrate the versatility of the Procrustes approach in population-genetic applications, verifying the concordance of SNP analyses using PCA and MDS, and showing that statistical maps of worldwide copy-number variants (CNVs) accord with statistical maps of SNP variation, especially when CNV analysis is limited to samples with the highest-quality data. As statistical maps with PCA and MDS have become increasingly common for use in summarizing population relationships, our examples highlight the potential of Procrustes-based quantitative comparisons for interpreting the results in these maps.
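The comparison described above can be illustrated with a minimal sketch for two-dimensional maps. It is not the authors' implementation. The abstract does not specify the statistic or the test, so the similarity score sqrt(1 - D) (with D the minimized sum of squared distances after optimal translation, scaling, rotation, and reflection), the permutation test, and the function names `procrustes_similarity` and `permutation_p_value` are all assumptions made for illustration. Points are handled as complex numbers, which gives a closed-form solution in two dimensions.

```python
import math
import random


def _center_and_scale(points):
    """Turn (x, y) pairs into complex numbers, centered at the origin
    and scaled so the sum of squared distances from the origin is 1."""
    z = [complex(x, y) for x, y in points]
    mean = sum(z) / len(z)
    z = [p - mean for p in z]
    norm = math.sqrt(sum(abs(p) ** 2 for p in z))
    if norm == 0:
        raise ValueError("all points coincide; the map has no shape")
    return [p / norm for p in z]


def procrustes_similarity(map_a, map_b):
    """Return t = sqrt(1 - D) in [0, 1] for two 2-D maps whose rows
    refer to the same samples in the same order (assumed statistic)."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must contain the same number of points")
    z = _center_and_scale(map_a)
    w = _center_and_scale(map_b)
    # Best fit over rotations of map_b onto map_a.
    rotation = abs(sum(p * q.conjugate() for p, q in zip(z, w)))
    # Best fit when map_b is mirrored first (conjugation is a reflection).
    reflection = abs(sum(p * q for p, q in zip(z, w)))
    # Clamp to 1.0 to absorb floating-point rounding.
    return min(1.0, max(rotation, reflection))


def permutation_p_value(map_a, map_b, n_perm=1000, seed=0):
    """Estimate how often a random relabeling of map_b matches map_a
    at least as well as the observed pairing (assumed test)."""
    rng = random.Random(seed)
    observed = procrustes_similarity(map_a, map_b)
    shuffled = list(map_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if procrustes_similarity(map_a, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical coordinates, invented for illustration only.
geographic = [(0, 0), (1, 0), (1, 2), (3, 1), (4, 4)]
# A mirrored, scaled, and shifted copy standing in for a "statistical map".
statistical = [(2 * y + 5, 2 * x - 3) for x, y in geographic]

t = procrustes_similarity(geographic, statistical)
p = permutation_p_value(geographic, statistical)
```

Because `statistical` differs from `geographic` only by a reflection, a scaling, and a shift, `t` comes out at essentially 1. Real PCA or MDS coordinates would give a lower value, and the permutation p-value indicates whether that value exceeds what random pairings of samples and locations would produce.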