Department of Computer Science, Rice University, Houston, TX 77005, United States.
Department of BioSciences, Rice University, Houston, TX 77005, United States.
Bioinformatics. 2024 Sep 1;40(Suppl 2):ii20-ii28. doi: 10.1093/bioinformatics/btae390.
Despite the widespread occurrence of polyploids across the Tree of Life, especially in the plant kingdom, very few computational methods have been developed to handle the specific complexities introduced by polyploids in phylogeny estimation. Furthermore, methods that are designed to account for polyploidy often disregard incomplete lineage sorting (ILS), a major source of heterogeneous gene histories, or are computationally very demanding. Therefore, there is a great need for efficient and robust methods to accurately reconstruct polyploid phylogenies.
We introduce Polyphest (POLYploid PHylogeny ESTimation), a new method for efficiently and accurately inferring species phylogenies in the presence of both polyploidy and ILS. Polyphest bypasses the need for extensive network space searches by first generating a multilabeled tree from gene trees and then converting it into a (uniquely labeled) species phylogeny. We compare Polyphest with two polyploid phylogeny estimation methods: PADRE, which does not account for ILS, and MPAllopp, which does. Polyphest is more accurate than PADRE and achieves accuracy comparable to that of MPAllopp, while being significantly faster. We also apply Polyphest to empirical data from hexaploid bread wheat and confirm its allopolyploid origin, along with the closest relatives of each of its subgenomes.
Polyphest is available at https://github.com/NakhlehLab/Polyphest.
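The conversion from a multilabeled tree to a uniquely labeled phylogeny described above can be illustrated with a small sketch. The Python code below is not Polyphest's procedure. It is a simplified, hypothetical illustration of the general idea of merging repeated subtrees of a multilabeled tree into a single node with several parents. The nested-tuple tree encoding and the function names `canonical`, `fold_mul_tree`, and `reticulations` are assumptions made for this example only.

```python
from collections import defaultdict


def canonical(tree):
    """Return an order-independent string key for a nested-tuple tree.

    Leaves are strings; internal nodes are tuples of children.
    """
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(sorted(canonical(child) for child in tree)) + ")"


def fold_mul_tree(tree):
    """Merge identical subtrees of a multilabeled tree (illustrative only).

    Returns (root_key, parents), where parents maps each node key to the
    set of distinct parent keys it has after merging.
    """
    parents = defaultdict(set)

    def visit(node):
        key = canonical(node)
        if not isinstance(node, str):
            for child in node:
                parents[visit(child)].add(key)
        return key

    root = visit(tree)
    return root, dict(parents)


def reticulations(parents):
    """Nodes with more than one distinct parent after merging."""
    return sorted(key for key, ps in parents.items() if len(ps) > 1)


# Hypothetical multilabeled tree in which the clade (B, C) appears twice.
mul_tree = (("A", ("B", "C")), (("B", "C"), "D"))
root, parents = fold_mul_tree(mul_tree)
print(reticulations(parents))  # ['(B,C)']
```

In this toy input the repeated clade (B, C) becomes one node with two parents, which is how a reticulation appears in the resulting network. The sketch ignores cases that a real method must handle, such as a node whose two children are identical subtrees, near-identical copies that differ because of ILS or estimation error, and branch lengths.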