STELAR: a statistically consistent coalescent-based species tree estimation method by maximizing triplet consistency.

Affiliations

Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka, 1205, Bangladesh.

Department of Computer Science, The University of Texas at Austin, Texas, 78712, USA.

Publication information

BMC Genomics. 2020 Feb 10;21(1):136. doi: 10.1186/s12864-020-6519-y.

DOI:10.1186/s12864-020-6519-y
PMID:32039704
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7011378/
Abstract

BACKGROUND

Species tree estimation is frequently based on phylogenomic approaches that use multiple genes from throughout the genome. However, estimating a species tree from a collection of gene trees can be complicated due to the presence of gene tree incongruence resulting from incomplete lineage sorting (ILS), which is modelled by the multi-species coalescent process. Maximum likelihood and Bayesian MCMC methods can potentially result in accurate trees, but they do not scale well to large datasets.
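
For intuition about how ILS produces gene tree incongruence, consider the standard three-species case under the multi-species coalescent. The symbols here are introduced for illustration and do not come from the abstract: the species tree is ((a,b),c), its internal branch has length t in coalescent units, and xy|z denotes a gene tree in which x and y are closest relatives.

```latex
\begin{aligned}
P(ab|c) &= 1 - \tfrac{2}{3}\,e^{-t},\\
P(ac|b) &= P(bc|a) = \tfrac{1}{3}\,e^{-t}.
\end{aligned}
```

For any t > 0 the topology that matches the species tree is strictly the most probable, yet as t approaches 0 all three probabilities approach 1/3, so short internal branches yield many discordant gene trees.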

RESULTS

We present STELAR (Species Tree Estimation by maximizing tripLet AgReement), a new fast and highly accurate statistically consistent coalescent-based method for estimating species trees from a collection of gene trees. We formalized the constrained triplet consensus (CTC) problem and showed that the solution to the CTC problem is a statistically consistent estimate of the species tree under the multi-species coalescent (MSC) model. STELAR is an efficient, highly accurate, and scalable dynamic-programming solution to the CTC problem. We compared the accuracy of STELAR with that of SuperTriplets, an alternative fast and highly accurate triplet-based supertree method, and with that of MP-EST and ASTRAL, two of the most popular and accurate coalescent-based methods. Experimental results suggest that STELAR matches the accuracy of ASTRAL and improves on MP-EST and SuperTriplets.
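
The idea of maximizing triplet agreement can be illustrated with a minimal sketch. It is not STELAR's algorithm: it assumes rooted binary trees written as nested Python tuples, searches every candidate tree exhaustively instead of using dynamic programming, and does not model the constraint that the CTC formulation places on the search space. All function names and the three example gene trees are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf labels of a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def triplets(tree):
    """Return the rooted triplets ab|c induced by a rooted binary tree.

    Each triplet is stored as (frozenset({a, b}), c): a and b are more
    closely related to each other than either is to c.
    """
    if isinstance(tree, str):
        return set()
    left, right = tree
    found = triplets(left) | triplets(right)
    # Two leaves on one side of a node plus one leaf on the other side
    # always form a triplet whose outgroup is the lone leaf.
    for inner, outer in ((left, right), (right, left)):
        for a, b in combinations(sorted(leaves(inner)), 2):
            for c in leaves(outer):
                found.add((frozenset((a, b)), c))
    return found


def triplet_score(candidate, gene_trees):
    """Count gene tree triplets that agree with the candidate tree."""
    candidate_triplets = triplets(candidate)
    return sum(len(candidate_triplets & triplets(g)) for g in gene_trees)


def all_rooted_binary_trees(taxa):
    """Yield every rooted binary tree on the taxa (small inputs only)."""
    taxa = list(taxa)
    if len(taxa) == 1:
        yield taxa[0]
        return
    first, rest = taxa[0], taxa[1:]
    # Keep the first taxon on the left so each root split is listed once;
    # k < len(rest) guarantees the right side is never empty.
    for k in range(len(rest)):
        for group in combinations(rest, k):
            left_taxa = [first, *group]
            right_taxa = [t for t in rest if t not in group]
            for left in all_rooted_binary_trees(left_taxa):
                for right in all_rooted_binary_trees(right_taxa):
                    yield (left, right)


def best_species_tree(gene_trees):
    """Exhaustively pick a tree with the highest triplet agreement."""
    taxa = sorted(leaves(gene_trees[0]))
    return max(all_rooted_binary_trees(taxa),
               key=lambda tree: triplet_score(tree, gene_trees))


# Hypothetical gene trees on four taxa, with some discordance.
example_gene_trees = [
    (("A", "B"), ("C", "D")),
    ((("A", "B"), "C"), "D"),
    (("A", "C"), ("B", "D")),
]
estimate = best_species_tree(example_gene_trees)
print(estimate, triplet_score(estimate, example_gene_trees))
```

The number of rooted binary trees grows super-exponentially with the number of taxa, so exhaustive search is only feasible for a handful of species; that growth is the motivation for an efficient dynamic programming formulation such as the one the abstract describes.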

CONCLUSIONS

Theoretical and empirical results (on both simulated and real biological datasets) suggest that STELAR is a valuable technique for species tree estimation from gene tree distributions.


