Department of Statistics, University of Wisconsin - Madison, WI, 53706, USA.
Department of Mathematics and Statistics, University of Alaska - Fairbanks, AK, 99775, USA.
Syst Biol. 2023 Nov 1;72(5):1171-1179. doi: 10.1093/sysbio/syad030.
We consider the evolution of phylogenetic gene trees along phylogenetic species networks, according to the network multispecies coalescent process, and introduce a new network coalescent model with correlated inheritance of gene flow. This model generalizes two traditional versions of the network coalescent: with independent or common inheritance. At each reticulation, multiple lineages of a given locus are inherited from parental populations chosen at random, either independently across lineages or with positive correlation according to a Dirichlet process. This process may account for locus-specific probabilities of inheritance, for example. We implemented the simulation of gene trees under these network coalescent models in the Julia package PhyloCoalSimulations, which depends on PhyloNetworks and its powerful network manipulation tools. Input species phylogenies can be read in extended Newick format, with branch lengths either in numbers of generations or in coalescent units. Simulated gene trees can be written in Newick format, in a way that preserves information about their embedding within the species network. This embedding can be used for downstream purposes, such as simulating species-specific processes like rate variation across species, or for other scenarios as illustrated in this note. This package should be useful for simulation studies and simulation-based inference methods. The software is available open source with documentation and a tutorial at https://github.com/cecileane/PhyloCoalSimulations.jl.
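To make the correlated-inheritance step concrete, here is a minimal Python sketch of how lineages arriving at a single reticulation might be assigned to parental populations. It is not the PhyloCoalSimulations implementation. The function name assign_parents, its arguments, and the parametrization (a correlation rho in [0, 1] mapped to a Dirichlet process concentration of (1 - rho) / rho, sampled with a Pólya urn scheme) are assumptions made for illustration.

```python
import random


def assign_parents(n_lineages, gammas, rho, rng=None):
    """Assign each lineage at one reticulation to a parental population.

    Hypothetical helper, for illustration only.
    gammas: inheritance probabilities of the parental populations (sum to 1).
    rho: assumed inheritance correlation in [0, 1];
         rho = 0 gives independent inheritance,
         rho = 1 gives common inheritance (all lineages share one parent).
    Returns a list of parent indices, one per lineage.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must be in [0, 1]")
    rng = rng or random.Random()
    parents = []
    for k in range(n_lineages):
        # Polya urn form of a Dirichlet process with concentration
        # alpha = (1 - rho) / rho: lineage k+1 copies an earlier lineage
        # with probability k / (alpha + k), rewritten to avoid dividing by rho.
        p_copy = 0.0 if k == 0 else k * rho / ((1.0 - rho) + k * rho)
        if rng.random() < p_copy:
            # Reuse the parent of a uniformly chosen earlier lineage.
            parents.append(rng.choice(parents))
        else:
            # Fresh draw from the base distribution given by gammas.
            parents.append(rng.choices(range(len(gammas)), weights=gammas)[0])
    return parents


if __name__ == "__main__":
    rng = random.Random(2023)
    for rho in (0.0, 0.5, 1.0):
        print(rho, assign_parents(8, [0.3, 0.7], rho, rng))
```

With rho = 0 every lineage draws its parent independently from gammas, and with rho = 1 all lineages follow the first one, so the two traditional models appear as the endpoints of this sketch.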