Steinrücken Matthias, Paul Joshua S, Song Yun S
Department of Statistics, University of California, Berkeley, CA 94720, USA.
Theor Popul Biol. 2013 Aug;87:51-61. doi: 10.1016/j.tpb.2012.08.004. Epub 2012 Sep 7.
Conditional sampling distributions (CSDs), sometimes referred to as copying models, underlie numerous practical tools in population genomic analyses. Though an important application that has received much attention is the inference of population structure, the explicit exchange of migrants at specified rates has not hitherto been incorporated into the CSD in a principled framework. Recently, in the case of a single panmictic population, a sequentially Markov CSD has been developed as an accurate, efficient approximation to a principled CSD derived from the diffusion process dual to the coalescent with recombination. In this paper, the sequentially Markov CSD framework is extended to incorporate subdivided population structure, thus providing an efficiently computable CSD that admits a genealogical interpretation related to the structured coalescent with migration and recombination. As a concrete application, it is demonstrated empirically that the CSD developed here can be employed to yield accurate estimation of a wide range of migration rates.
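To make the term "copying model" concrete, the following is a minimal sketch of a generic copying-model CSD evaluated with the HMM forward algorithm. It is not the structured sequentially Markov CSD developed in the paper, whose hidden state also tracks the absorption time and deme of the lineage. Everything named here is an assumption made for illustration: the function `copying_likelihood`, the per-site switch probability `switch_prob` (a stand-in for recombination), the per-site mismatch probability `mut_prob` (a stand-in for mutation), and the optional `weights` vector, which crudely favours copying from some haplotypes (for example, those in the same deme) and is not a model of migration rates.

```python
def copying_likelihood(new_hap, panel, switch_prob, mut_prob, weights=None):
    """Probability of new_hap given panel under a simple copying model.

    Hypothetical illustration only: the hidden state at each site is the
    index of the panel haplotype being copied.
    """
    n = len(panel)
    if weights is None:
        weights = [1.0] * n  # uniform prior over which haplotype is copied
    total = float(sum(weights))
    pi = [w / total for w in weights]

    def emit(j, site):
        # Copy the allele faithfully with probability 1 - mut_prob.
        if panel[j][site] == new_hap[site]:
            return 1.0 - mut_prob
        return mut_prob

    # Forward variable: joint probability of the alleles seen so far
    # and of currently copying haplotype j.
    forward = [pi[j] * emit(j, 0) for j in range(n)]
    for site in range(1, len(new_hap)):
        mass = sum(forward)
        forward = [
            emit(j, site)
            * ((1.0 - switch_prob) * forward[j] + switch_prob * pi[j] * mass)
            for j in range(n)
        ]
    return sum(forward)


# Assumed toy data: three biallelic haplotypes; the first two are given
# extra weight, as if they shared a deme with the new haplotype.
panel = [[0, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
deme_weights = [2.0, 2.0, 1.0]
p = copying_likelihood([0, 0, 1, 0], panel, 0.1, 0.01, deme_weights)
```

Because the transitions and emissions are proper probability distributions, the values returned for all possible new haplotypes of a given length sum to one, which is a convenient sanity check. Multiplying such conditional probabilities over haplotypes added one at a time gives an approximate likelihood for a sample, which is the general way copying models are used for parameter inference.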