Wellcome Centre for Human Genetics, Oxford, United Kingdom.
Big Data Institute, Oxford, United Kingdom.
PLoS One. 2021 Mar 2;16(3):e0247647. doi: 10.1371/journal.pone.0247647. eCollection 2021.
Demographic events shape a population's genetic diversity, a process described by the coalescent-with-recombination model that relates demography and genetics by an unobserved sequence of genealogies along the genome. As the space of genealogies over genomes is large and complex, inference under this model is challenging. Formulating the coalescent-with-recombination model as a continuous-time and -space Markov jump process, we develop a particle filter for such processes, and use waypoints that under appropriate conditions allow the problem to be reduced to the discrete-time case. To improve inference, we generalise the Auxiliary Particle Filter for discrete-time models, and use Variational Bayes to model the uncertainty in parameter estimates for rare events, avoiding biases seen with Expectation Maximization. Using real and simulated genomes, we show that past population sizes can be accurately inferred over a larger range of epochs than was previously possible, opening the possibility of jointly analyzing multiple genomes under complex demographic models. Code is available at https://github.com/luntergroup/smcsmc.
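As a rough illustration of the discrete-time case to which the abstract says the problem can be reduced, the following is a minimal sketch of a plain bootstrap particle filter on a toy one-dimensional autoregressive model. It is not the method of the paper: it does not implement the continuous-time filter with waypoints, the generalised Auxiliary Particle Filter, or Variational Bayes, and it is unrelated to the smcsmc code base. The model, the parameter values, and all function names (`simulate`, `bootstrap_particle_filter`, `gaussian_logpdf`) are assumptions made purely for illustration.

```python
import math
import random


def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def simulate(n_steps, rng, phi=0.9, sd_state=1.0, sd_obs=0.5):
    """Simulate a toy hidden AR(1) state with noisy observations (assumed model)."""
    states, observations = [], []
    x = rng.gauss(0.0, sd_state)
    for _ in range(n_steps):
        x = phi * x + rng.gauss(0.0, sd_state)          # hidden state transition
        states.append(x)
        observations.append(x + rng.gauss(0.0, sd_obs))  # noisy observation
    return states, observations


def bootstrap_particle_filter(observations, n_particles, rng,
                              phi=0.9, sd_state=1.0, sd_obs=0.5):
    """Return filtered state means and a log-likelihood estimate."""
    particles = [rng.gauss(0.0, sd_state) for _ in range(n_particles)]
    filtered_means = []
    log_likelihood = 0.0
    for y in observations:
        # 1. Propagate each particle through the transition model.
        particles = [phi * x + rng.gauss(0.0, sd_state) for x in particles]
        # 2. Weight each particle by the likelihood of the observation.
        log_w = [gaussian_logpdf(y, x, sd_obs) for x in particles]
        max_log_w = max(log_w)  # subtract the maximum for numerical stability
        weights = [math.exp(lw - max_log_w) for lw in log_w]
        total = sum(weights)
        log_likelihood += max_log_w + math.log(total / n_particles)
        normalised = [w / total for w in weights]
        # 3. Record the weighted mean as the filtered state estimate.
        filtered_means.append(sum(w * x for w, x in zip(normalised, particles)))
        # 4. Resample (multinomial) so that all particles carry equal weight.
        particles = rng.choices(particles, weights=normalised, k=n_particles)
    return filtered_means, log_likelihood


if __name__ == "__main__":
    rng = random.Random(1)
    true_states, observations = simulate(50, rng)
    means, log_lik = bootstrap_particle_filter(observations, 500, rng)
    print("estimated log-likelihood:", log_lik)
    print("last true state, last filtered mean:", true_states[-1], means[-1])
```

An auxiliary particle filter differs from this sketch in that it uses information about the next observation when choosing which particles to resample, which tends to help when observations are informative; the sketch resamples on the current weights only.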