Infectious Diseases Programme, Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore & Infectious Diseases Group, Genome Institute of Singapore, Singapore 138672
Centrum Wiskunde & Informatica (CWI), Science Park 123, 1098 XG Amsterdam, The Netherlands.
G3 (Bethesda). 2020 Nov 5;10(11):3959-3967. doi: 10.1534/g3.120.401575.
Ewens's sampling formula is a foundational theoretical result that connects probability and number theory with molecular genetics and molecular evolution; it was the analytical result required for testing the neutral theory of evolution, and has since been used, directly or indirectly, in a number of population genetics statistics. Ewens's sampling formula, in turn, is deeply connected to Stirling numbers of the first kind. Here, we explore the cumulative distribution function of these Stirling numbers, which enables a single direct estimate of the sum, using representations in terms of the incomplete beta function. This estimator enables an improved method for calculating an asymptotic estimate for one useful statistic, Fu's Fs. By reducing the calculation from a sum of terms involving Stirling numbers to a single estimate, we simultaneously improve accuracy and dramatically increase speed.
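For orientation, the sum that the abstract refers to can be sketched directly. The Python below is a minimal illustration of the baseline term-by-term summation, not the estimator introduced in the paper. It assumes the standard definition of Fu's Fs, in which S' = P(K >= k_obs | theta) is the tail sum of |s(n, k)| theta^k / (theta (theta + 1) ... (theta + n - 1)) over k from k_obs to n, and Fs = ln(S' / (1 - S')). The function names are hypothetical and chosen only for this example.

```python
import math


def unsigned_stirling1_row(n):
    """Return [c(n, 0), ..., c(n, n)], the unsigned Stirling numbers of the first kind.

    Uses the recurrence c(m + 1, k) = m * c(m, k) + c(m, k - 1) with exact integers.
    """
    row = [1]  # c(0, 0) = 1
    for m in range(n):
        new = [0] * (m + 2)
        for k in range(m + 2):
            stay = m * row[k] if k <= m else 0
            join = row[k - 1] if k >= 1 else 0
            new[k] = stay + join
        row = new
    return row


def tail_probability(n, k_obs, theta):
    """S' = P(K >= k_obs | theta) under the Ewens sampling formula, summed term by term."""
    if not (1 <= k_obs <= n) or theta <= 0:
        raise ValueError("need 1 <= k_obs <= n and theta > 0")
    row = unsigned_stirling1_row(n)
    # log of the rising factorial theta * (theta + 1) * ... * (theta + n - 1)
    log_denominator = sum(math.log(theta + i) for i in range(n))
    # Work in log space; math.log accepts arbitrarily large Python integers.
    log_terms = [
        math.log(row[k]) + k * math.log(theta) - log_denominator
        for k in range(k_obs, n + 1)
    ]
    peak = max(log_terms)
    return math.exp(peak) * sum(math.exp(t - peak) for t in log_terms)


def fu_fs_direct(n, k_obs, theta):
    """Fu's Fs = ln(S' / (1 - S')), from the direct sum (illustrative baseline only)."""
    s_prime = tail_probability(n, k_obs, theta)
    if s_prime >= 1.0:
        return math.inf  # the tail covers (numerically) all of the probability mass
    return math.log(s_prime / (1.0 - s_prime))
```

The Stirling numbers grow roughly factorially with n, which is why this sketch keeps them as exact integers and sums in log space. Replacing the n - k_obs + 1 terms with a single estimate is the improvement the abstract describes.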