Laboratoire de Biométrie et Biologie Évolutive UMR5558, Univ Lyon, Université Lyon 1, CNRS, F-69622 Villeurbanne, France.
Inria, Centre de Recherche de Lyon, F-69603 Villeurbanne, France.
Syst Biol. 2022 Aug 10;71(5):1147-1158. doi: 10.1093/sysbio/syac011.
Most species are extinct, and those that are not are often unknown. Sequenced and sampled species are often a minority of the known ones. Past evolutionary events involving horizontal gene flow, such as horizontal gene transfer, hybridization, introgression, and admixture, are therefore likely to involve "ghosts," that is, extinct, unknown, or unsampled lineages. The existence of these ghost lineages is widely acknowledged, but their possible impact on the detection of gene flow and on the identification of the species involved is largely overlooked. This impact is generally regarded as a possible source of error that can, to a reasonable approximation, be ignored. We explore the possible influence of absent species on an evolutionary study by quantifying the effect of ghost lineages on introgression as detected by the popular D-statistic method. We show from simulated data that, under certain frequently encountered conditions, the donors and recipients of horizontal gene flow can be wrongly identified if ghost lineages are not taken into account. In particular, using a distant outgroup, which is usually recommended, increases the probability of error and leads to false interpretations in most cases. We conclude that introgression from ghost lineages should be systematically considered as a possible, even probable, alternative scenario. [ABBA-BABA; D-statistic; gene flow; ghost lineage; introgression; simulation.]
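For readers unfamiliar with the D-statistic named above, the following is a minimal sketch of the standard ABBA-BABA calculation, not the authors' simulation pipeline. It assumes four aligned sequences for the usual asymmetric topology (((P1, P2), P3), O), treats the outgroup allele as ancestral (A), and computes D = (nABBA - nBABA) / (nABBA + nBABA). The function name `d_statistic` and the toy sequences are hypothetical, introduced only for illustration.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns and return (n_abba, n_baba, D).

    Assumes one aligned sequence per taxon and the topology
    (((P1, P2), P3), O); the outgroup allele is taken as ancestral.
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("all four sequences must have the same length")

    n_abba = 0
    n_baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Skip gaps and ambiguous bases.
        if any(base not in "ACGT" for base in (a, b, c, o)):
            continue
        if a == o and b == c and b != o:
            # ABBA: P2 and P3 share the derived allele.
            n_abba += 1
        elif b == o and a == c and a != o:
            # BABA: P1 and P3 share the derived allele.
            n_baba += 1

    total = n_abba + n_baba
    d = (n_abba - n_baba) / total if total else float("nan")
    return n_abba, n_baba, d


# Toy alignment (hypothetical data): three ABBA sites and one BABA site.
p1 = "AATCAG"
p2 = "GCACTG"
p3 = "GCTCTA"
out = "AAACAG"
print(d_statistic(p1, p2, p3, out))
```

Without gene flow, ABBA and BABA sites are expected in equal numbers, so D is expected to be close to 0. A significant deviation is conventionally read as introgression between P3 and either P1 or P2. The abstract's argument is that an unsampled ghost lineage may produce the same kind of signal, so the value of D alone does not identify the donor and the recipient.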