Department of Biological Physics, Eötvös Loránd University, Budapest, Hungary.
Centre for Biological Diversity, University of St Andrews, St Andrews, United Kingdom.
Mol Biol Evol. 2019 Jun 1;36(6):1294-1301. doi: 10.1093/molbev/msz043.
Molecular phylogenetics has long neglected polymorphisms within present and ancestral populations. Recently, multispecies coalescent-based methods have increased in popularity; however, their application is limited to a small number of species and individuals. We introduced a polymorphism-aware phylogenetic model (PoMo), which overcomes this limitation and scales well with the increasing amount of sequence data while accounting for present and ancestral polymorphisms. PoMo circumvents the handling of gene trees and infers species trees directly from allele frequency data. Here, we extend the PoMo implementation in IQ-TREE and integrate a search for the statistically best-fit mutation model, the ability to infer mutation rate variation across sites, and the assessment of branch support values. We exemplify an analysis of a hundred species with ten haploid individuals each, showing that PoMo can perform inference on large data sets. PoMo is more accurate than standard substitution models applied to concatenated alignments, yet almost as fast. We also provide bmm-simulate, a software package that allows simulation of sequences evolving under PoMo. The new options consolidate the value of PoMo for phylogenetic analyses with population data.