Leveraging Weighted Quartet Distributions for Enhanced Species Tree Inference from Genome-Wide Data.

Author information

Hasan Navid Bin, Biswas Avijit, Wahab Zahin, Mahbub Mahim, Reaz Rezwana, Bayzid Md Shamsuzzoha

Affiliations

Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.

Publication information

Genome Biol Evol. 2025 Sep 2;17(9). doi: 10.1093/gbe/evaf159.

Abstract

Species tree estimation from genes sampled from throughout the whole genome is challenging because of gene tree discordance, often caused by incomplete lineage sorting (ILS). Quartet-based summary methods for estimating species trees from a collection of gene trees are becoming popular due to their high accuracy and theoretical guarantees of robustness to arbitrarily high amounts of ILS. ASTRAL, the most widely used quartet-based method, aims to infer species trees by maximizing the number of quartets in the gene trees consistent with the species tree. An alternative approach is inferring quartets for all subsets of four species and amalgamating them into a coherent species tree. While summary methods can be sensitive to gene tree estimation error, quartet amalgamation offers an advantage by potentially bypassing gene tree estimation. However, greatly understudied is the choice of weighted quartet inference method and downstream effects on species tree estimations under realistic model conditions. In this study, we investigated a wide array of methods for generating weighted quartets and critically assessed their impact on species tree inference. Our study provides evidence that the careful generation and amalgamation of weighted quartets, as implemented in methods like wQFM, can lead to significantly more accurate trees than popular methods like ASTRAL, especially in the face of gene tree estimation errors.
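
The two ideas in the abstract, weighted quartets and the quartet-consistency objective, can be illustrated with a toy sketch. It is not the ASTRAL or wQFM implementation. It assumes rooted gene trees written as nested Python tuples of taxon names, and it assumes the simplest weighting scheme, in which a quartet's weight is the number of gene trees that display it (the study compares several weighting methods; this is only one possibility). The function names `clades`, `quartet_topology`, `weighted_quartets`, and `quartet_score` are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def clades(tree):
    """Return (leaf set of `tree`, list of leaf sets of all its subtrees)."""
    if isinstance(tree, str):  # a leaf is just a taxon name
        leaf = frozenset([tree])
        return leaf, [leaf]
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def quartet_topology(clade_list, four):
    """Return the split ab|cd displayed for four taxa, or None if unresolved.

    The split is encoded as a frozenset holding the two taxon pairs.
    """
    four = frozenset(four)
    for clade in clade_list:
        inside = clade & four
        if len(inside) == 2:  # this clade separates two taxa from the other two
            return frozenset([inside, four - inside])
    return None


def weighted_quartets(gene_trees):
    """Assumed weighting: count how many gene trees display each quartet."""
    weights = Counter()
    for tree in gene_trees:
        leaves, clade_list = clades(tree)
        for four in combinations(sorted(leaves), 4):
            topology = quartet_topology(clade_list, four)
            if topology is not None:
                weights[topology] += 1
    return weights


def quartet_score(species_tree, weights):
    """Total weight of the quartets displayed by a candidate species tree."""
    leaves, clade_list = clades(species_tree)
    score = 0
    for four in combinations(sorted(leaves), 4):
        topology = quartet_topology(clade_list, four)
        if topology is not None:
            score += weights[topology]  # Counter returns 0 for unseen quartets
    return score


# Three toy gene trees: two support AB|CD, one supports AC|BD.
gene_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
]
weights = weighted_quartets(gene_trees)
print(quartet_score((("A", "B"), ("C", "D")), weights))  # 2
print(quartet_score((("A", "C"), ("B", "D")), weights))  # 1
```

In this toy setting, a quartet-score method would prefer the first candidate because it agrees with more gene-tree quartets, while a quartet amalgamation method would take the `weights` table itself as input and assemble a tree from it. The sketch enumerates every four-taxon subset, so it is only practical for very small examples.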

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9528/12401674/32010d80ab94/evaf159f1.jpg
