

wQFM-DISCO: DISCO-enabled wQFM improves phylogenomic analyses despite the presence of paralogs.

Authors

Hakim Sheikh Azizul, Ratul Md Rownok Zahan, Bayzid Md Shamsuzzoha

Affiliation

Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka 1205, Bangladesh.

Publication information

Bioinform Adv. 2024 Nov 27;4(1):vbae189. doi: 10.1093/bioadv/vbae189. eCollection 2024.

DOI:10.1093/bioadv/vbae189
PMID:39664861
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11634537/
Abstract

MOTIVATION

Gene trees often differ from the species trees that contain them due to various factors, including incomplete lineage sorting (ILS) and gene duplication and loss (GDL). Several highly accurate species tree estimation methods have been introduced to explicitly address ILS, including ASTRAL, a widely used statistically consistent method, and wQFM, a quartet amalgamation approach experimentally shown to be more accurate than ASTRAL. Two recent advancements, ASTRAL-Pro and DISCO, have emerged in phylogenomics to consider GDL. ASTRAL-Pro introduces a refined quartet similarity measure, accounting for orthology and paralogy. On the other hand, DISCO offers a general strategy to decompose multi-copy gene trees into a collection of single-copy trees, allowing the utilization of methods previously designed for species tree inference in the context of single-copy gene trees.
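
To make the decomposition idea concrete, the following is a minimal sketch of splitting a multi-copy gene tree into single-copy trees. It is not the authors' implementation. It assumes that every internal node has already been tagged as a speciation ("S") or duplication ("D") event, and that at each duplication node the child with more leaves is kept while the other is detached as a separate tree. The tuple encoding and the function names (`decompose`, `leaves`, `to_newick`) are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple, Union

# A leaf is a species name; an internal node is (tag, left, right),
# where tag is "S" (speciation) or "D" (duplication).
# Assumption: the tags are given; computing them is out of scope here.
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def leaves(t: Tree) -> List[str]:
    """Return the species labels at the leaves of t, left to right."""
    if isinstance(t, str):
        return [t]
    _, left, right = t
    return leaves(left) + leaves(right)


def decompose(tree: Tree) -> List[Tree]:
    """Split a tagged multi-copy tree at its duplication nodes.

    Assumed rule: at each "D" node keep the child with more leaves
    and detach the other child as a tree of its own.
    """
    detached: List[Tree] = []

    def prune(t: Tree) -> Tree:
        if isinstance(t, str):
            return t
        tag, left, right = t
        left, right = prune(left), prune(right)  # post-order traversal
        if tag == "D":
            if len(leaves(left)) >= len(leaves(right)):
                keep, drop = left, right
            else:
                keep, drop = right, left
            detached.append(drop)
            return keep
        return (tag, left, right)

    main = prune(tree)
    return [main] + detached


def to_newick(t: Tree) -> str:
    """Render a tree as a Newick string without the trailing semicolon."""
    if isinstance(t, str):
        return t
    _, left, right = t
    return "(" + to_newick(left) + "," + to_newick(right) + ")"


# Species A has two copies because of a duplication below the root.
gene_tree: Tree = ("S", ("D", ("S", "A", "B"), "A"), "C")
for piece in decompose(gene_tree):
    print(to_newick(piece) + ";")
```

With correct tags, each resulting tree contains at most one copy per species, so it can be passed to a summary method designed for single-copy gene trees. Very small pieces (such as the single detached leaf here) carry no quartet information and would simply contribute nothing to a quartet-based method.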

RESULTS

In this study, we first introduce some variants of DISCO to examine its underlying hypotheses and present analytical results on the statistical guarantees of DISCO. In particular, we introduce DISCO-R, a variant of DISCO with a refined and improved pruning strategy that provides more accurate and robust results. We then demonstrate with extensive evaluation studies on a collection of simulated and real data sets that wQFM paired with DISCO variants consistently matches or outperforms ASTRAL-Pro and other competing methods.

AVAILABILITY AND IMPLEMENTATION

DISCO-R and other variants are freely available at https://github.com/skhakim/DISCO-variants.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4c5b/11634537/3528f283f794/vbae189f3.jpg

