Division of Biostatistics, College of Public Health, and Department of Statistics, The Ohio State University, Columbus, OH 43210, USA.
Department of Statistics and Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, Columbus, OH 43210, USA.
Stat Appl Genet Mol Biol. 2024 Feb 19;23(1). doi: 10.1515/sagmb-2022-0061. eCollection 2024 Jan 1.
Methods based on the multi-species coalescent have been widely used to estimate phylogenetic trees from genome-scale DNA sequence data and thereby to understand the evolutionary relationships among the sampled species. Evolutionary processes such as hybridization, which creates new species through interbreeding between two different species, necessitate inferring a species network instead of a species tree. A species tree is strictly bifurcating and thus fails to incorporate hybridization events, which require an internal node of degree three. Hence, it is crucial to decide whether a tree or a network analysis should be performed for a given DNA sequence data set, a decision that depends on whether hybrid species are present among the sampled species. Although many methods have been proposed for hybridization detection, it is rare to find a technique that does so globally while considering a data generation mechanism that allows both hybridization and incomplete lineage sorting. In this paper, we consider hybridization and coalescence in a unified framework and propose a new test that can detect whether there are any hybrid species in a set of species of arbitrary size. Based on this global test of hybridization, one can decide whether a tree or a network analysis is appropriate for a given data set.