Lapierre Marguerite, Blin Camille, Lambert Amaury, Achaz Guillaume, Rocha Eduardo P C
Atelier de Bioinformatique, UMR7205 ISYEB, MNHN-UPMC-CNRS-EPHE, Muséum National d'Histoire Naturelle, Paris, France; Collège de France, Center for Interdisciplinary Research in Biology (CIRB), CNRS UMR 7241, Paris, France.
Sorbonne Universités, UPMC Univ Paris 06, IFD, 4 Place Jussieu, Paris Cedex 05, France; Institut Pasteur, Microbial Evolutionary Genomics, Paris, France; CNRS, UMR3525, Paris, France.
Mol Biol Evol. 2016 Jul;33(7):1711-25. doi: 10.1093/molbev/msw048. Epub 2016 Mar 1.
Recent studies have linked demographic changes and epidemiological patterns in bacterial populations using coalescent-based approaches. We identified 26 studies using skyline plots and found that 21 of them inferred an overall population expansion. This surprising result led us to analyze the impact of natural selection, recombination (gene conversion), and sampling biases on demographic inference using skyline plots and site frequency spectra (SFS). Forward simulations based on biologically relevant parameters from Escherichia coli populations showed that theoretical arguments on the detrimental impact of recombination, and especially natural selection, on the reconstructed genealogies cannot be ignored in practice. In fact, both processes systematically lead to spurious interpretations of population expansion in skyline plots (and in SFS for selection). Weak purifying selection, and especially positive selection, had important effects on skyline plots, showing patterns akin to those of population expansions. State-of-the-art techniques to remove recombination further amplified these biases. We simulated three common sampling biases in microbiological research: uniform, clustered, and mixed sampling. Alone, or together with recombination and selection, they further mislead demographic inference, producing almost any possible skyline shape or SFS. Interestingly, sampling sub-populations also affected skyline plots and SFS, because the coalescent rates of populations and their sub-populations had different distributions. This study suggests that extreme caution is needed when inferring demographic changes based solely on reconstructed genealogies. We suggest that the development of novel sampling strategies and joint analyses with diverse population genetic methods are strictly necessary to estimate demographic changes in populations where selection, recombination, and biased sampling are present.
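To make the SFS signal discussed above concrete, here is a minimal sketch of how an unfolded site frequency spectrum can be tabulated from a 0/1 haplotype matrix and compared with the standard neutral, constant-size expectation, in which the number of sites carrying i derived copies is proportional to 1/i. The function names and the toy data are illustrative assumptions, not the authors' pipeline or simulation output. An excess of low-frequency variants relative to this expectation is the pattern usually read as population expansion, which is why processes that also inflate rare variants can be mistaken for it.

```python
def site_frequency_spectrum(haplotypes):
    """Unfolded SFS: entry i-1 counts sites where the derived allele (coded 1)
    is carried by exactly i of the n sampled sequences, for i = 1..n-1."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    sfs = [0] * (n - 1)
    for site in range(n_sites):
        derived = sum(h[site] for h in haplotypes)
        if 0 < derived < n:  # skip sites that are not polymorphic in the sample
            sfs[derived - 1] += 1
    return sfs


def neutral_expectation(sfs):
    """Expected SFS under the standard neutral, constant-size model
    (proportional to 1/i), rescaled to the observed number of segregating sites."""
    n = len(sfs) + 1
    segregating = sum(sfs)
    harmonic = sum(1.0 / i for i in range(1, n))
    return [segregating / (i * harmonic) for i in range(1, n)]


# Hypothetical toy sample: 4 sequences, 5 sites (rows = sequences).
haplotypes = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

observed = site_frequency_spectrum(haplotypes)
expected = neutral_expectation(observed)
for i, (obs, exp) in enumerate(zip(observed, expected), start=1):
    print(f"derived count {i}: observed {obs}, neutral expectation {exp:.2f}")
```

In this toy sample the singleton class is larger than the neutral expectation. Taken alone, such a skew cannot distinguish demographic growth from selection or from a biased sample, which is the caution the abstract raises.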