MRC Centre for Global Infectious Disease Analysis, Imperial College London, London W2 1PG, UK.
Department of Zoology, University of Oxford, Oxford OX1 3SY, UK.
Syst Biol. 2021 Dec 16;71(1):121-138. doi: 10.1093/sysbio/syab037.
In Bayesian phylogenetics, the coalescent process provides an informative framework for inferring changes in the effective size of a population from a phylogeny (or tree) of sequences sampled from that population. Popular coalescent inference approaches such as the Bayesian Skyline Plot, Skyride, and Skygrid all model these population size changes with a discontinuous, piecewise-constant function but then apply a smoothing prior to ensure that their posterior population size estimates transition gradually with time. These prior distributions implicitly encode extra population size information that is not available from the observed coalescent data or tree. Here, we present a novel statistic, $\Omega$, to quantify and disaggregate the relative contributions of the coalescent data and prior assumptions to the resulting posterior estimate precision. Our statistic also measures the additional mutual information introduced by such priors. Using $\Omega$ we show that, because it is surprisingly easy to overparametrize piecewise-constant population models, common smoothing priors can lead to overconfident and potentially misleading inference, even under robust experimental designs. We propose $\Omega$ as a useful tool for detecting when effective population size estimates are overly reliant on prior assumptions and for improving quantification of the uncertainty in those estimates. [Coalescent processes; effective population size; information theory; phylodynamics; prior assumptions; skyline plots.]