Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland.
Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
Mol Biol Evol. 2022 Jan 7;39(1). doi: 10.1093/molbev/msab342.
The structured coalescent allows inferring migration patterns between viral subpopulations from genetic sequence data. However, these analyses typically assume that no genetic recombination process impacted the sequence evolution of pathogens. For segmented viruses that can undergo reassortment, such as influenza, this assumption is violated. Reassortment reshuffles the segments of different parent lineages upon a coinfection event, which means that the shared history of viruses has to be represented by a network instead of a tree. Therefore, full-genome analyses of such viruses are complex or even impossible. Although this problem has been addressed for unstructured populations, it is still impossible to account for population structure, such as that induced by different host populations, while also accounting for reassortment. We address this by extending the structured coalescent to account for reassortment and present a framework for investigating possible ties between reassortment and migration (host jump) events. This method can accurately estimate subpopulation-dependent effective population sizes, reassortment rates, and migration rates from simulated data. Additionally, we apply the new model to avian influenza A/H5N1 sequences sampled from two avian host types, Anseriformes and Galliformes. We contrast our results with a structured coalescent without reassortment inference, which assumes independently evolving segments. This reveals that taking segment reassortment into account and using sequencing data from several viral segments for joint phylodynamic inference leads to different estimates for effective population sizes, migration rates, and clock rates. This new model is implemented as the Structured Coalescent with Reassortment package for BEAST 2.5 and is available at https://github.com/jugne/SCORE.
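To make the three processes named above concrete, the following is a minimal sketch of the backward-in-time event rates in a structured coalescent with reassortment. It is an illustration under simplifying assumptions, not the SCORE implementation: it assumes the subpopulation of every lineage is known, that coalescence happens only between lineages in the same subpopulation, and that each lineage reassorts at a constant per-lineage rate. The function name `event_rates`, its parameters, and all numeric values are hypothetical.

```python
from math import comb


def event_rates(lineages, ne, migration, reassortment):
    """Return backward-in-time event rates for each subpopulation (deme).

    lineages:     dict deme -> number of lineages currently in that deme
    ne:           dict deme -> effective population size (hypothetical values)
    migration:    dict deme -> {other deme: backward migration rate per lineage}
    reassortment: dict deme -> reassortment rate per lineage
    """
    rates = {}
    for deme, k in lineages.items():
        # Any of the k-choose-2 pairs in the same deme may coalesce.
        rates[("coalescence", deme)] = comb(k, 2) / ne[deme]
        # Each lineage may split into two parent lineages by reassortment.
        rates[("reassortment", deme)] = k * reassortment[deme]
        # Each lineage may move to another deme (a host jump).
        for other, m in migration[deme].items():
            rates[("migration", deme, other)] = k * m
    return rates


# Hypothetical example with the two host types from the abstract.
example = event_rates(
    lineages={"Anseriformes": 3, "Galliformes": 2},
    ne={"Anseriformes": 2.0, "Galliformes": 1.0},
    migration={
        "Anseriformes": {"Galliformes": 0.1},
        "Galliformes": {"Anseriformes": 0.2},
    },
    reassortment={"Anseriformes": 0.05, "Galliformes": 0.05},
)
total_rate = sum(example.values())
```

In a simulation, the waiting time to the next event would be drawn from an exponential distribution with rate `total_rate`, and the event type chosen in proportion to the individual rates. Inference on real data additionally has to treat lineage locations as unknown, which this sketch does not attempt.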