Bioinformatics Research Center (BiRC), Aarhus University, Aarhus, Denmark.
Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America.
PLoS Genet. 2024 Feb 8;20(2):e1010836. doi: 10.1371/journal.pgen.1010836. eCollection 2024 Feb.
Genome-wide genealogies of multiple species carry detailed information about demographic and selection processes on individual branches of the phylogeny. Here, we introduce TRAILS, a hidden Markov model that accurately infers time-resolved population genetics parameters, such as ancestral effective population sizes and speciation times, for ancestral branches using a multi-species alignment of three species and an outgroup. TRAILS leverages the information contained in incomplete lineage sorting fragments by modelling genealogies along the genome as rooted three-leaved trees, each with a topology and two coalescent events happening in discretized time intervals within the phylogeny. Posterior decoding of the hidden Markov model can be used to infer the ancestral recombination graph for the alignment and details on demographic changes within a branch. Since TRAILS performs posterior decoding at the base-pair level, genome-wide scans based on the posterior probabilities can be devised to detect deviations from neutrality. Using TRAILS on a human-chimp-gorilla-orangutan alignment, we recover speciation parameters and extract information about the topology and coalescent times at high resolution.
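The hidden state space described in the abstract (a topology plus two coalescent events placed in discretized time intervals) can be illustrated with a minimal sketch. It is not the TRAILS implementation. The species labels `A`, `B`, `C`, the function name `enumerate_hidden_states`, the parameters `n_ab` and `n_abc`, and the tuple layout are all hypothetical. The sketch assumes a species tree ((A,B),C) with an outgroup used only for rooting, and assumes that two coalescent events may fall in the same interval of the common ancestor of all three species.

```python
from itertools import product


def enumerate_hidden_states(n_ab: int, n_abc: int):
    """Enumerate illustrative hidden states for a three-species coalescent HMM.

    Each state is a tuple (topology, first_branch, first_interval, second_interval).
    n_ab  : number of time intervals in the branch ancestral to A and B (assumed name)
    n_abc : number of time intervals in the ancestor of A, B and C (assumed name)
    """
    states = []

    # No incomplete lineage sorting: A and B coalesce inside the AB branch,
    # then the resulting lineage coalesces with C in the ABC ancestor.
    for i, j in product(range(n_ab), range(n_abc)):
        states.append(("((A,B),C)", "AB", i, j))

    # Deep coalescence: both events happen in the ABC ancestor, so any of the
    # three rooted topologies is possible. The first event cannot be later
    # than the second (i <= j); sharing an interval is an assumption here.
    for topology in ("((A,B),C)", "((A,C),B)", "((B,C),A)"):
        for i in range(n_abc):
            for j in range(i, n_abc):
                states.append((topology, "ABC", i, j))

    return states


if __name__ == "__main__":
    states = enumerate_hidden_states(n_ab=3, n_abc=3)
    print(len(states))
    for state in states[:5]:
        print(state)
```

Under these assumptions the number of states is `n_ab * n_abc + 3 * n_abc * (n_abc + 1) / 2`, so the state space grows quadratically with the number of intervals in the deepest branch. States whose topology differs from the species tree correspond to the incomplete lineage sorting fragments that the abstract says the model exploits.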