Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, USA.
Nat Genet. 2012 May 20;44(6):725-31. doi: 10.1038/ng.2285.
Characterizing genetic diversity within and between populations has broad applications in studies of human disease and evolution. We propose a new approach, spatial ancestry analysis, for the modeling of genotypes in two- or three-dimensional space. In spatial ancestry analysis (SPA), we explicitly model the spatial distribution of each SNP by assigning an allele frequency as a continuous function in geographic space. We show that the explicit modeling of the allele frequency allows individuals to be localized on the map on the basis of their genetic information alone. We apply our SPA method to a European and a worldwide population genetic variation data set and identify SNPs showing large gradients in allele frequency, and we suggest these as candidate regions under selection. These regions include SNPs in the well-characterized LCT region, as well as at loci including FOXP2, OCA2 and LRP1B.
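The modeling idea described above can be illustrated with a minimal sketch. The abstract does not give the functional form of the allele frequency surface, so the logistic slope function used here is an assumption, as are the binomial genotype likelihood, the grid search (a simple stand-in for whatever optimizer the authors use), and every name in the code (`allele_frequency`, `log_likelihood`, `localize`, `slope_steepness`). The sketch shows how a per-SNP frequency function over two-dimensional coordinates lets an individual be placed on a map from genotypes alone, and how the size of the slope coefficients could serve as a rough measure of gradient steepness.

```python
import math

# Hypothetical sketch: each SNP is a pair (a, b), where a is a slope vector
# and b an intercept. The logistic form is an assumption, not taken from the abstract.

def allele_frequency(x, a, b):
    """Allele frequency at location x under an assumed logistic surface."""
    z = sum(ai * xi for ai, xi in zip(a, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(genotypes, x, snps):
    """Log-likelihood of genotypes (0, 1 or 2 allele copies) at location x,
    treating each genotype as Binomial(2, f) with f the local allele frequency."""
    total = 0.0
    for g, (a, b) in zip(genotypes, snps):
        f = allele_frequency(x, a, b)
        f = min(max(f, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += (math.log(math.comb(2, g))
                  + g * math.log(f)
                  + (2 - g) * math.log(1.0 - f))
    return total

def localize(genotypes, snps, grid):
    """Return the candidate location in grid with the highest log-likelihood."""
    return max(grid, key=lambda x: log_likelihood(genotypes, x, snps))

def slope_steepness(a):
    """Euclidean norm of the slope vector, used here as a gradient proxy."""
    return math.hypot(*a)

# Toy data: SNP 1 rises in frequency along the first axis,
# SNP 2 rises along the second axis.
snps = [((4.0, 0.0), 0.0), ((0.0, 4.0), 0.0)]
grid = [(i / 2, j / 2) for i in range(-2, 3) for j in range(-2, 3)]

# An individual homozygous for allele 1 at SNP 1 and carrying no copies at SNP 2.
best = localize([2, 0], snps, grid)
print(best)
```

In this toy setting the individual is placed where SNP 1 is common and SNP 2 is rare. With real data, many SNPs would contribute to the likelihood, and the coefficients would first have to be estimated from individuals of known or inferred origin.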