Institute of General Microbiology, Kiel University, Kiel, Germany.
Genome Biol Evol. 2023 Jun 1;15(6). doi: 10.1093/gbe/evad096.
The determination of the last common ancestor (LCA) of a group of species plays a vital role in evolutionary theory. Traditionally, an LCA is inferred by the rooting of a fully resolved species tree. From a theoretical perspective, however, inference of the LCA amounts to the reconstruction of just one branch (the root branch) of the true species tree and should therefore be a much easier task than the full resolution of the species tree. Discarding the reliance on a hypothesized species tree and its rooting leads us to reevaluate what phylogenetic signal is directly relevant to LCA inference and to recast the task as that of sampling the total evidence from all gene families at the genomic scope. Here, we reformulate LCA and root inference in the framework of statistical hypothesis testing and outline an analytical procedure to formally test competing a priori LCA hypotheses and to infer confidence sets for the earliest speciation events in the history of a group of species. Applying our methods to two demonstrative data sets, we show that our inference of the opisthokonta LCA agrees well with common knowledge. Inference of the proteobacteria LCA shows that it is most closely related to modern Epsilonproteobacteria, raising the possibility that it may have been characterized by a chemolithoautotrophic and anaerobic lifestyle. Our inference is based on data comprising between 43% (opisthokonta) and 86% (proteobacteria) of all gene families. Approaching LCA inference within a statistical framework renders the phylogenomic inference powerful and robust.