Leviyang Sivan
Department of Mathematics, Georgetown University, Washington, DC, USA.
J Math Biol. 2011 Feb;62(2):203-89. doi: 10.1007/s00285-010-0333-0. Epub 2010 Feb 26.
We examine genetic statistics used in the study of structured populations. In a 1999 paper, Wakeley observed that the coalescent process associated with the finite island model can be decomposed into a scattering phase and a collecting phase. This decomposition becomes exact in the large population limit, with the coalescent at the end of the scattering phase converging to the Ewens sampling formula and the coalescent during the collecting phase converging to the Kingman coalescent. In this paper we introduce a class of limiting models, which we refer to as G/KC models, that generalize Wakeley's decomposition. G in G/KC represents a completely general limit for the scattering phase, while KC represents a Kingman coalescent limit for the collecting phase. We show that both the island and two-dimensional stepping stone models converge to G/KC models in the large population limit. We then derive the distribution of the statistic F_st for all G/KC models under a large sample limit for the cases of strong or weak mutation, thereby deriving the large population, large sample limiting distribution of F_st for the island and two-dimensional stepping stone models as a special case of a general formula. Our methods allow us to take the large population and large sample limits simultaneously. In the context of large population, large sample limits, we show that the variance of F_st in the presence of weak mutation collapses as O(1/log d), where d is the number of demes sampled. Further, we show that this O(1/log d) rate is caused by a heavy tail in the distribution of F_st. Our analysis of F_st can be extended to an entire class of genetic statistics, and we use our approach to examine homozygosity measures. Our analysis uses coalescent-based methods.
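As an illustration of the Ewens sampling formula mentioned above as the scattering-phase limit, the following minimal Python sketch evaluates the standard formula for a sample of size n. It is not taken from the paper: the function names, the multiplicity-vector representation, and the parameter name `theta` are assumptions made for this example, and how `theta` relates to the island model's parameters is not specified here.

```python
from math import factorial, prod


def rising_factorial(theta, n):
    """Return theta * (theta + 1) * ... * (theta + n - 1)."""
    return prod(theta + i for i in range(n))


def ewens_probability(counts, theta):
    """Probability of an allelic partition under the Ewens sampling formula.

    counts[j - 1] is the number of allelic classes containing exactly j
    sampled lineages, so the sample size is n = sum(j * counts[j - 1]).
    `theta` is a generic positive parameter (an assumption of this sketch).
    """
    n = sum(j * a for j, a in enumerate(counts, start=1))
    weight = prod(
        theta ** a / (j ** a * factorial(a))
        for j, a in enumerate(counts, start=1)
    )
    return factorial(n) / rising_factorial(theta, n) * weight


def partitions_as_counts(n):
    """Yield each integer partition of n as a multiplicity vector of length n."""

    def helper(remaining, max_part):
        # Build partitions with parts in non-increasing order.
        if remaining == 0:
            yield []
            return
        for part in range(min(remaining, max_part), 0, -1):
            for rest in helper(remaining - part, part):
                yield [part] + rest

    for parts in helper(n, n):
        counts = [0] * n
        for p in parts:
            counts[p - 1] += 1
        yield counts


if __name__ == "__main__":
    theta = 1.5  # hypothetical value chosen only for illustration
    for counts in partitions_as_counts(4):
        print(counts, ewens_probability(counts, theta))
```

Summing `ewens_probability` over every vector produced by `partitions_as_counts(n)` gives a quick sanity check that the probabilities form a distribution for a given `theta`.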