Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China.
Department of Biological Sciences, National University of Singapore, Singapore 117543, Republic of Singapore.
Syst Biol. 2021 Aug 11;70(5):961-975. doi: 10.1093/sysbio/syab024.
Phylogenetic trees based on genome-wide sequence data may not always represent the true evolutionary history, for a variety of reasons. One process that can lead to incorrect reconstruction of species phylogenies is gene flow, especially if interspecific gene flow has affected large parts of the genome. We investigated phylogenetic relationships within a clade comprising eight species of passerine birds (Phylloscopidae, Phylloscopus, leaf warblers) using one de novo genome assembly and 78 resequenced genomes. Using hypothesis-exclusion trials based on D-statistics, phylogenetic network analysis, and demographic inference analysis, we identified ancient gene flow affecting large parts of the genome between one species and the ancestral lineage of a sister species pair. This ancient gene flow consistently caused erroneous reconstruction of the phylogeny when large amounts of genome-wide sequence data were used. In contrast, the true relationships were captured when smaller parts of the genome were analyzed, showing that the "winner-takes-all democratic majority tree" is not necessarily the true species tree. Under this condition, smaller amounts of data may sometimes avoid the effects of gene flow due to stochastic sampling, as hidden reticulation histories are more likely to emerge from the use of larger data sets, especially whole-genome data sets. We also found that genomic regions affected by ancient gene flow generally exhibited higher genomic differentiation but lower recombination rate and nucleotide diversity. Our study highlights the importance of considering reticulation in phylogenetic reconstructions in the genomic era. [Bifurcation; introgression; recombination; reticulation; Phylloscopus.]
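The abstract does not describe how the D-statistics were computed, so the following is only a minimal sketch of the standard ABBA-BABA form of Patterson's D for a four-taxon arrangement (((P1, P2), P3), outgroup). The function name `patterson_d` and the toy sequences are hypothetical and are not taken from the study. The sketch assumes aligned haploid sequences of equal length and treats the outgroup allele as ancestral. A clearly positive D suggests excess allele sharing between P2 and P3, a negative D between P1 and P3, and a value near zero is consistent with incomplete lineage sorting alone.

```python
def patterson_d(p1, p2, p3, outgroup):
    """Return (D, n_ABBA, n_BABA) for four aligned haploid sequences.

    Assumed topology: (((P1, P2), P3), outgroup); the outgroup allele
    is treated as ancestral (A) and the other allele as derived (B).
    """
    valid = set("ACGT")
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        alleles = {a, b, c, o}
        # Keep only biallelic sites with unambiguous bases.
        if len(alleles) != 2 or not alleles <= valid:
            continue
        if a == o and b == c:
            abba += 1  # P1 ancestral; P2 and P3 share the derived allele
        elif b == o and a == c:
            baba += 1  # P2 ancestral; P1 and P3 share the derived allele
    total = abba + baba
    d = (abba - baba) / total if total else 0.0
    return d, abba, baba


# Toy alignment (hypothetical data): three ABBA sites and one BABA site.
d, n_abba, n_baba = patterson_d("AAAAC", "CCCAA", "CCCCC", "AAAAA")
print(d, n_abba, n_baba)
```

In practice a point estimate like this is not enough: significance is usually assessed with a block jackknife across the genome, and tools that work on population allele frequencies rather than single sequences are normally preferred.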