Tannier Eric, Tricou Théo, Benali Syrine, de Vienne Damien M
Laboratoire de Biométrie et Biologie Évolutive UMR5558, Univ Lyon, Université Lyon 1, CNRS, Villeurbanne F-69622, France.
Inria Lyon Research Center, Villeurbanne F-69622, France.
Mol Biol Evol. 2025 Jun 4;42(6). doi: 10.1093/molbev/msaf128.
When a gene undergoes horizontal gene transfer (HGT) under the "replacement" model, in which the transferred gene replaces its homolog in the recipient genome, the corresponding gene phylogeny departs from the species phylogeny by a Subtree Prune and Regraft (SPR) operation: the "recipient branch" is moved from its initial position to attach to the "donor branch". Based on this observation, various methods have used SPRs to infer HGTs. We examine this apparent equivalence in the light of ghost lineages, i.e. related species absent from the phylogeny because they are extinct, unknown, or have not been sampled. In the presence of ghost lineages, an SPR cannot be directly interpreted as an HGT from the donor branch, because HGTs can have ghost lineages as donors. A possible and frequent interpretation, which we call "induced HGT", is that the transferred gene leaves the sampled phylogeny for a ghost lineage at the donor branch and is transferred back from a ghost lineage at the recipient branch. We show by simulations that this interpretation is misleading in a significant number of cases. For instance, if the studied phylogeny represents 1% of all the species able to exchange genetic material with the 100 sampled species, and 11 transfers occurred, then SPRs do not correspond to induced HGTs in around 50% of the cases. This leaves open the question of a coherent interpretation of SPRs in the presence of ghosts, and the issue applies to a certain extent to other phylogenetic methods for simulating or inferring HGT, such as reconciliation or phylogenetic networks.
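As an illustration only (not the authors' simulation code), the sketch below shows how a replacement transfer maps to an SPR on a toy rooted binary tree, and how a single transfer from a ghost donor can look like an SPR once the ghost is removed from the tree. The nested-tuple tree encoding, the function names `prune`, `regraft` and `spr`, and the four-species example with ghost leaf `G` are all assumptions made for this example.

```python
# Toy rooted binary trees: a leaf is a string, an internal node is a 2-tuple.
# All names here (prune, regraft, spr) are hypothetical helpers for illustration.

def prune(tree, target):
    """Remove the subtree `target` and suppress the resulting unary node."""
    if tree == target:
        return None
    if isinstance(tree, str):
        return tree
    left, right = (prune(child, target) for child in tree)
    if left is None:
        return right
    if right is None:
        return left
    return (left, right)

def regraft(tree, donor, subtree):
    """Attach `subtree` as sister of the subtree `donor` (i.e. on the donor branch)."""
    if tree == donor:
        return (tree, subtree)
    if isinstance(tree, str):
        return tree
    return tuple(regraft(child, donor, subtree) for child in tree)

def spr(tree, recipient, donor):
    """Move the recipient branch onto the donor branch (replacement model)."""
    pruned = prune(tree, recipient)
    moved = regraft(pruned, donor, recipient)
    # If nothing changed, `donor` was not found in the pruned tree.
    assert moved != pruned, "donor branch not found after pruning"
    return moved

# Sampled species tree, and a transfer from branch C that replaces the gene in B.
species_tree = (("A", "B"), ("C", "D"))
gene_tree = spr(species_tree, recipient="B", donor="C")

# Same species plus an unsampled ghost lineage G, sister to C.
full_tree = (("A", "B"), (("C", "G"), "D"))
gene_tree_full = spr(full_tree, recipient="B", donor="G")  # ghost is the donor
gene_tree_sampled = prune(gene_tree_full, "G")             # keep sampled leaves only
```

In this toy case the transfer from the ghost `G` yields, after restriction to the sampled species, the same gene tree as an SPR onto branch `C`, the branch where the ghost attaches. The example covers only a single transfer; it does not reproduce the multi-transfer situations in which, according to the abstract, SPRs stop corresponding to induced HGTs.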