Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.
Center for Plant Systems Biology, VIB, Ghent, Belgium.
Mol Biol Evol. 2019 Jul 1;36(7):1384-1404. doi: 10.1093/molbev/msz088.
Gene tree-species tree reconciliation methods have been employed for studying ancient whole-genome duplication (WGD) events across the eukaryotic tree of life. Most approaches have relied on maximum likelihood gene trees and their maximum parsimony reconciliation to count duplication events on specific branches of interest in a reference species tree. Such approaches do not account for uncertainty in the gene tree and reconciliation, or do so only heuristically. The effects of these simplifications on the inference of ancient WGDs are unclear. In particular, the effects of variation in gene duplication and loss rates across the species tree have not been considered. Here, we developed a fully probabilistic approach for phylogenomic reconciliation-based WGD inference, accounting for both gene tree and reconciliation uncertainty using a method based on the principle of amalgamated likelihood estimation. The model and methods are implemented in a maximum likelihood and a Bayesian setting and account for variation of duplication and loss rates across the species tree, using methods inspired by phylogenetic divergence time estimation. We applied our newly developed framework to ancient WGDs in land plants and investigated the effects of duplication and loss rate variation on reconciliation-based and gene count-based assessments of these earlier proposed WGDs.
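The abstract mentions accounting for gene tree uncertainty through the principle of amalgamated likelihood estimation. As a rough illustration of one ingredient commonly associated with that principle, the sketch below approximates the probability of a tree topology from a sample of trees using conditional clade probabilities: the product, over internal nodes, of how often a clade was split in a given way among the sampled trees that contain it. This is not the implementation described in the abstract. The function names (`tally`, `ccp`), the nested-tuple tree encoding, the use of rooted trees, and the toy sample are all assumptions made for illustration only.

```python
from collections import Counter


def tally(tree, clade_counts, split_counts):
    """Record every clade and clade split in one rooted binary tree.

    A tree is a leaf name (str) or a pair (left, right) of subtrees.
    Returns the leaf set of `tree` as a frozenset.
    """
    if isinstance(tree, str):
        return frozenset([tree])
    left = tally(tree[0], clade_counts, split_counts)
    right = tally(tree[1], clade_counts, split_counts)
    clade = left | right
    clade_counts[clade] += 1
    split_counts[(clade, frozenset([left, right]))] += 1
    return clade


def ccp(tree, clade_counts, split_counts):
    """Return (leaf set, conditional clade probability of the topology)."""
    if isinstance(tree, str):
        return frozenset([tree]), 1.0
    left, p_left = ccp(tree[0], clade_counts, split_counts)
    right, p_right = ccp(tree[1], clade_counts, split_counts)
    clade = left | right
    seen = clade_counts[clade]
    # Fraction of sampled trees containing this clade that split it this way.
    p_split = split_counts[(clade, frozenset([left, right]))] / seen if seen else 0.0
    return clade, p_left * p_right * p_split


# Hypothetical sample of three gene tree topologies on five leaves.
sample = [
    ((("A", "B"), "C"), ("D", "E")),
    (("A", ("B", "C")), ("D", "E")),
    (((("A", "B"), "C"), "D"), "E"),
]
clade_counts, split_counts = Counter(), Counter()
for sampled_tree in sample:
    tally(sampled_tree, clade_counts, split_counts)

# A topology that is absent from the sample but built from observed splits.
unsampled = ((("A", ("B", "C")), "D"), "E")
_, probability = ccp(unsampled, clade_counts, split_counts)
print(probability)
```

In this toy sample the unsampled topology still receives a nonzero probability, because each of its splits was observed in at least one sampled tree. That property is what lets a sample of trees stand in for a much larger set of candidate topologies.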