Allen Benjamin, McAvoy Alex
Department of Mathematics, Emmanuel College, 400 The Fenway, Boston, MA, 02115, USA.
School of Data Science and Society, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA; Department of Mathematics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA.
Theor Popul Biol. 2024 Aug;158:150-169. doi: 10.1016/j.tpb.2024.06.004. Epub 2024 Jun 14.
The coalescent is a stochastic process representing ancestral lineages in a population undergoing neutral genetic drift. Originally defined for a well-mixed population, the coalescent has been adapted in various ways to accommodate spatial, age, and class structure, along with other features of real-world populations. To further extend the range of population structures to which coalescent theory applies, we formulate a coalescent process for a broad class of neutral drift models with arbitrary (but fixed) spatial, age, sex, and class structure, haploid or diploid genetics, and any fixed mating pattern. Here, the coalescent is represented as a random sequence of mappings (C_t), indexed by time t ≥ 0, from a finite set G to itself. The set G represents the "sites" (in individuals, in particular locations and/or classes) at which alleles can live. The state of the coalescent at time t, C_t: G → G, maps each site g ∈ G to the site containing g's ancestor, t time-steps into the past. Using this representation, we define and analyze coalescence time, coalescence branch length, mutations prior to coalescence, and stationary probabilities of identity-by-descent and identity-by-state. For low mutation, we provide a recipe for computing identity-by-descent and identity-by-state probabilities via the coalescent. Applying our results to a diploid population with arbitrary sex ratio r, we find that measures of genetic dissimilarity, among any set of sites, are scaled by 4r(1-r) relative to the even sex ratio case.
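The mapping representation described above can be illustrated with a minimal Python sketch. It assumes a haploid Wright-Fisher population in which every site picks its parent site uniformly at random at each step; this is only one simple member of the class of models the abstract covers, chosen here for illustration. The function names (`sample_parent_map`, `compose`, `coalescence_time`, `sex_ratio_scaling`) are hypothetical and do not come from the paper.

```python
import random


def sample_parent_map(n, rng):
    """One step of neutral drift among n sites.

    Assumption (illustration only): haploid Wright-Fisher, so each site
    draws its parent site uniformly at random from the previous step.
    """
    return [rng.randrange(n) for _ in range(n)]


def compose(c_t, parent_map):
    """Extend the coalescent one step further into the past.

    If c_t[g] is the site holding g's ancestor t steps back, the ancestor
    t + 1 steps back sits at the parent of that site.
    """
    return [parent_map[a] for a in c_t]


def coalescence_time(n, rng, max_steps=1_000_000):
    """Steps until every site maps to one common ancestral site."""
    c = list(range(n))  # the state at t = 0 is the identity map on G
    for t in range(max_steps + 1):
        if len(set(c)) == 1:
            return t
        c = compose(c, sample_parent_map(n, rng))
    raise RuntimeError("no coalescence within max_steps")


def sex_ratio_scaling(r):
    """Scaling factor 4r(1-r) quoted in the abstract for sex ratio r."""
    return 4 * r * (1 - r)


if __name__ == "__main__":
    rng = random.Random(2024)
    times = [coalescence_time(10, rng) for _ in range(1000)]
    print("mean coalescence time, 10 sites:", sum(times) / len(times))
    print("scaling at r = 0.25:", sex_ratio_scaling(0.25))
```

Storing each state as a list indexed by site keeps composition a single pass over G, and the number of distinct values in the list counts the ancestral lineages remaining at that time.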