University of Hawaii, College of Tropical Agriculture and Human Resources, Department of Plant and Environmental Protection Sciences, Entomology Section, 3050 Maile Way, Honolulu, HI, 96822-2231, USA.
Mol Phylogenet Evol. 2023 Nov;188:107892. doi: 10.1016/j.ympev.2023.107892. Epub 2023 Jul 29.
As genomic data proliferate, the prevalence of post-speciation gene flow is making species boundaries and relationships increasingly ambiguous. Although current approaches inferring fully bifurcating phylogenies based on concatenated datasets provide simple and robust answers to many species relationships, they may be inaccurate because the models ignore interspecific gene flow and incomplete lineage sorting. To examine the potential error resulting from ignoring gene flow, we generated both a RAD-seq dataset and a highly multiplexed amplicon (HiMAP) dataset of 500 protein-coding loci for a monophyletic group of 12 species defined as the Bactrocera dorsalis sensu lato clade. Because it includes some of the world's worst agricultural pests, the taxonomy of the B. dorsalis s.l. clade is important for trade and quarantines. However, taxonomic resolution is confounded by intra- and interspecific phenotypic variation and convergence, mitochondrial introgression across half of the species, and viable hybrids. We compared the topological convergence of our datasets using concatenated phylogenetic and various multispecies coalescent approaches, some of which account for gene flow. All analyses agreed on species delimitation, but the inferred species relationships were incongruent. Under concatenation, both datasets suggest identical species relationships with mostly high statistical support. However, multispecies coalescent and multispecies network approaches suggest markedly different hypotheses and detected significant gene flow. We suggest that the network approaches are likely more accurate because gene flow violates the assumptions of the concatenated phylogenetic analyses, but the data-reductive requirements of network approaches resulted in reduced statistical support, and these approaches could not unambiguously resolve gene flow directions.
Our study highlights the importance of testing for gene flow, particularly with phylogenomic datasets, even when concatenated approaches receive high statistical support.