Assessing approaches for inferring species trees from multi-copy genes.

Authors

Chaudhary Ruchi, Boussau Bastien, Burleigh J Gordon, Fernández-Baca David

Affiliations

Department of Computer Science, Iowa State University, Ames, IA 50011, USA; Department of Biology, University of Florida, Gainesville, FL 32611, USA; and Université de Lyon, Université Lyon 1, CNRS, UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, Villeurbanne F-69622, France.

Publication information

Syst Biol. 2015 Mar;64(2):325-39. doi: 10.1093/sysbio/syu128. Epub 2014 Dec 23.

DOI: 10.1093/sysbio/syu128
PMID: 25540456

Abstract

With the availability of genomic sequence data, there is increasing interest in using genes with a possible history of duplication and loss for species tree inference. Here we assess the performance of both nonprobabilistic and probabilistic species tree inference approaches using gene duplication and loss and coalescence simulations. We evaluated the performance of gene tree parsimony (GTP) based on duplication (Only-dup), duplication and loss (Dup-loss), and deep coalescence (Deep-c) costs, the NJst distance method, the MulRF supertree method, and PHYLDOG, which jointly estimates gene trees and species tree using a hierarchical probabilistic model. We examined the effects of gene tree and species sampling, gene tree error, and duplication and loss rates on the accuracy of phylogenetic estimates. In the 10-taxon duplication and loss simulation experiments, MulRF is more accurate than the other methods when the duplication and loss rates are low, and Dup-loss is generally the most accurate when the duplication and loss rates are high. PHYLDOG performs well in 10-taxon duplication and loss simulations, but its run time is prohibitively long on larger data sets. In the larger duplication and loss simulation experiments, MulRF outperforms all other methods in experiments with at most 100 taxa; however, in the larger simulation, Dup-loss generally performs best. In all duplication and loss simulation experiments with more than 10 taxa, all methods perform better with more gene trees and fewer missing sequences, and they are all affected by gene tree error. Our results also highlight high levels of error in estimates of duplications and losses from GTP methods and demonstrate the usefulness of methods based on generic tree distances for large analyses.
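As an illustration of what a "generic tree distance" can look like, the sketch below computes the classic Robinson-Foulds distance between two rooted, single-copy trees on the same leaf set. This is an assumption-laden toy, not the MulRF method or the evaluation procedure of the study: the nested-tuple tree encoding and the function names `clusters` and `rf_distance` are hypothetical, and the generalization to multi-copy gene trees is not shown.

```python
def clusters(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names.

    A tree is a nested tuple whose leaves are strings (an encoding assumed
    here purely for illustration).
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    # The root clade is shared by every tree on the same leaf set, so drop it.
    found.discard(all_leaves)
    return found


def rf_distance(tree_a, tree_b):
    """Count the clades present in exactly one of the two trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


true_tree = ((("A", "B"), "C"), ("D", "E"))
estimated_tree = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(true_tree, estimated_tree))  # prints 2
```

The two example trees differ only in whether A groups with B or with C, so each tree contributes one unshared clade and the distance is 2. A distance of 0 means the two topologies are identical.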

Similar articles

1. Assessing approaches for inferring species trees from multi-copy genes. Syst Biol. 2015 Mar;64(2):325-39. doi: 10.1093/sysbio/syu128. Epub 2014 Dec 23.
2. Inferring species trees from incongruent multi-copy gene trees using the Robinson-Foulds distance. Algorithms Mol Biol. 2013 Nov 1;8(1):28. doi: 10.1186/1748-7188-8-28.
3. The accuracy of species tree estimation under simulation: a comparison of methods. Syst Biol. 2011 Mar;60(2):126-37. doi: 10.1093/sysbio/syq073. Epub 2010 Nov 18.
4. MulRF: a software package for phylogenetic analysis using multi-copy gene trees. Bioinformatics. 2015 Feb 1;31(3):432-3. doi: 10.1093/bioinformatics/btu648. Epub 2014 Oct 1.
5. Species Tree Inference Using a Mixture Model. Mol Biol Evol. 2015 Sep;32(9):2469-82. doi: 10.1093/molbev/msv115. Epub 2015 May 11.
6. From gene trees to species trees II: species tree inference by minimizing deep coalescence events. IEEE/ACM Trans Comput Biol Bioinform. 2011 Nov-Dec;8(6):1685-91. doi: 10.1109/TCBB.2011.83.
7. Integrating Sequence Evolution into Probabilistic Orthology Analysis. Syst Biol. 2015 Nov;64(6):969-82. doi: 10.1093/sysbio/syv044. Epub 2015 Jun 30.
8. Maximum likelihood estimates of species trees: how accuracy of phylogenetic inference depends upon the divergence history and sampling design. Syst Biol. 2009 Oct;58(5):501-8. doi: 10.1093/sysbio/syp045. Epub 2009 Aug 20.
9. Inferring angiosperm phylogeny from EST data with widespread gene duplication. BMC Evol Biol. 2007 Feb 8;7 Suppl 1(Suppl 1):S3. doi: 10.1186/1471-2148-7-S1-S3.
10. Exact solutions for species tree inference from discordant gene trees. J Bioinform Comput Biol. 2013 Oct;11(5):1342005. doi: 10.1142/S0219720013420055. Epub 2013 Oct 2.

Cited by

1. DISCO: Species Tree Inference using Multicopy Gene Family Tree Decomposition. Syst Biol. 2022 Apr 19;71(3):610-629. doi: 10.1093/sysbio/syab070.
2. ASTRAL-Pro: Quartet-Based Species-Tree Inference despite Paralogy. Mol Biol Evol. 2020 Nov 1;37(11):3292-3307. doi: 10.1093/molbev/msaa139.
3. FastMulRFS: fast and accurate species tree estimation under generic gene duplication and loss models. Bioinformatics. 2020 Jul 1;36(Suppl_1):i57-i65. doi: 10.1093/bioinformatics/btaa444.
4. Phylogenetic tree building in the genomic age. Nat Rev Genet. 2020 Jul;21(7):428-444. doi: 10.1038/s41576-020-0233-0. Epub 2020 May 18.
5. MIPhy: identify and quantify rapidly evolving members of large gene families. PeerJ. 2018 May 29;6:e4873. doi: 10.7717/peerj.4873. eCollection 2018.