Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, USA.
Department of Biology, Stanford University, Stanford, CA 94305, USA.
Stat Appl Genet Mol Biol. 2023 Dec 12;22(1). doi: 10.1515/sagmb-2023-0004. eCollection 2023 Jan 1.
Allele-sharing statistics for a genetic locus measure the dissimilarity between two populations as the mean dissimilarity between random pairs of individuals, one from each population. Owing to within-population variation in genotype, allele-sharing dissimilarities can be nonzero when computed between a population and itself. We consider the mathematical properties of allele-sharing dissimilarities in a pair of populations, treating the allele frequencies in the two populations parametrically. Examining two formulations of allele-sharing dissimilarity, we obtain the distributions of within-population and between-population dissimilarities for pairs of individuals. We then mathematically explore the scenarios in which, for certain allele-frequency distributions, the within-population dissimilarity (the mean dissimilarity between randomly chosen members of a population) can exceed the dissimilarity between two populations. Such scenarios help explain observations in population-genetic data that members of a population can, on average, be more genetically dissimilar from each other than they are from members of another population. For a population pair, however, the mathematical analysis finds that at least one of the two populations always has a within-population dissimilarity smaller than the between-population dissimilarity. We illustrate the mathematical results with an application to human population-genetic data.
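As a rough illustration of the comparison described above, the sketch below uses a deliberately simplified dissimilarity, assumed here for illustration and not necessarily either of the two formulations examined in the paper: the dissimilarity of a pair is 1 if two independently drawn alleles differ and 0 if they match, so that the mean dissimilarity is one minus the probability of a match. The function name `mean_dissimilarity` and the allele frequencies are hypothetical.

```python
def mean_dissimilarity(p, q):
    """Mean mismatch between one allele drawn with frequencies p and one
    drawn independently with frequencies q (simplified, assumed measure)."""
    if len(p) != len(q):
        raise ValueError("frequency vectors must list the same alleles")
    # Probability that the two draws carry the same allele.
    match_probability = sum(pi * qi for pi, qi in zip(p, q))
    return 1.0 - match_probability


# Hypothetical allele frequencies at one biallelic locus.
pop_a = [0.6, 0.4]
pop_b = [0.9, 0.1]

within_a = mean_dissimilarity(pop_a, pop_a)  # population A with itself
within_b = mean_dissimilarity(pop_b, pop_b)  # population B with itself
between = mean_dissimilarity(pop_a, pop_b)   # one draw from each population

print(f"within A: {within_a:.2f}")
print(f"within B: {within_b:.2f}")
print(f"between:  {between:.2f}")
```

With these made-up frequencies, the within-population value for A is larger than the between-population value, while the value for B is smaller. Under this simplified measure, the smaller of the two within-population values can never exceed the between-population value, which follows from the Cauchy-Schwarz inequality applied to the match probabilities.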