Grusea Simona, Rodríguez Willy, Pinchon Didier, Chikhi Lounès, Boitard Simon, Mazet Olivier
Institut de Mathématiques de Toulouse, Université de Toulouse, Institut National des Sciences Appliquées, 31077, Toulouse, France.
Laboratoire Évolution et Diversité Biologique (EDB UMR 5174), Université de Toulouse Midi-Pyrénées, CNRS, IRD, UPS, 118 route de Narbonne, Bât. 4R1, 31062, Toulouse Cedex 9, France.
J Math Biol. 2019 Jan;78(1-2):189-224. doi: 10.1007/s00285-018-1272-4. Epub 2018 Jul 20.
The increasing amount of genomic data currently available is expanding the horizons of population genetics inference. A wide range of methods have been published to detect and date major changes in population size during the history of species. At the same time, there has been increasing recognition that population structure can generate genetic data similar to those generated under models of population size change. Recently, Mazet et al. (Heredity 116(4):362-371, 2016) introduced the idea that, for any model of population structure, it is always possible to find a panmictic model with a particular function of population size change having an identical distribution of [Formula: see text] (the time of the first coalescence for a sample of size two). This implies that a panmictic model and a structured model cannot be told apart when the analysis is based only on [Formula: see text]. In this paper, based on an analytical study of the rate matrix of the ancestral lineage process, we obtain new theoretical results about the joint distribution of the coalescence times [Formula: see text] for a sample of three haploid genes in an n-island model with constant size. Even if, for any [Formula: see text], it is always possible to find a size-change scenario for a panmictic population such that the marginal distribution of [Formula: see text] is exactly the same as in an n-island model with constant population size, we show that the joint distribution of the coalescence times [Formula: see text] for a sample of three genes contains enough information to distinguish between a panmictic population and an n-island model of constant size.
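The identifiability problem for a sample of size two can be illustrated numerically. The following is a minimal sketch, not the authors' code. It assumes a symmetric n-island model with time scaled so that two lineages in the same deme coalesce at rate 1 and each lineage migrates at rate M/2; the function name `size_change_matching_t2` and this parameterization are hypothetical choices made for illustration. In a panmictic model with relative size function lambda(t), the coalescence hazard is 1/lambda(t), so setting lambda(t) = P(T > t) / f(t), with T the pairwise coalescence time in the island model and f its density, yields a panmictic scenario with the same distribution of that time.

```python
import numpy as np

def size_change_matching_t2(t, n_islands, M):
    """Relative population size lambda(t) of a panmictic model whose pairwise
    coalescence time has the same law as in a symmetric n-island model,
    for two genes sampled in the same deme (illustrative assumptions only).

    Transient states of the two-lineage process:
      0 = both lineages in the same deme, 1 = lineages in different demes.
    Assumed rates: coalescence at rate 1 from state 0, state 0 -> 1 at rate M,
    state 1 -> 0 at rate M / (n_islands - 1).
    """
    a = M / (n_islands - 1)
    # Sub-generator restricted to the two transient states.
    Q = np.array([[-(1.0 + M), M],
                  [a, -a]])
    # Matrix exponential exp(Q t) via eigendecomposition (eigenvalues are real here).
    w, V = np.linalg.eig(Q)
    P = (V * np.exp(w * t)) @ np.linalg.inv(V)
    survival = P[0].sum()   # P(T > t): no coalescence yet
    density = P[0, 0]       # f(t): in state 0 at time t, then coalesce at rate 1
    return float(np.real(survival / density))

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 5.0, 50.0):
        print(t, size_change_matching_t2(t, n_islands=10, M=1.0))
```

Under these assumptions the matching size function starts at 1 and increases backward in time toward a plateau, so a constant-size structured population looks like a panmictic population that has declined toward the present when only pairs of genes are considered. This sketch covers only the size-two case; it does not implement the three-gene joint distribution studied in the paper.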