An empirical evaluation of two-stage species tree inference strategies using a multilocus dataset from North American pines.

Affiliation

Department of Biology, Pennsylvania State University, University Park, PA 16802, USA.

Publication information

BMC Evol Biol. 2014 Mar 29;14:67. doi: 10.1186/1471-2148-14-67.

DOI:10.1186/1471-2148-14-67
PMID:24678701
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4021425/
Abstract

BACKGROUND

As it becomes increasingly possible to obtain DNA sequences of orthologous genes from diverse sets of taxa, species trees are frequently being inferred from multilocus data. However, the behavior of many methods for performing this inference has remained largely unexplored. Some methods have been proven to be consistent given certain evolutionary models, whereas others rely on criteria that, although appropriate for many parameter values, have peculiar zones of the parameter space in which they fail to converge on the correct estimate as data sets increase in size.
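For reference, the consistency property mentioned above is usually stated as follows. The symbols are introduced here for illustration only and do not appear in the abstract: T denotes the true species tree and \hat{T}_n the estimate obtained from n loci generated under the assumed evolutionary model.

\[
\lim_{n \to \infty} \Pr\left(\hat{T}_n = T\right) = 1
\]

A method is inconsistent in a region of parameter space if, for parameter values in that region, this probability does not approach 1 as n grows.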

RESULTS

Here, using North American pines, we empirically evaluate the behavior of 24 strategies for species tree inference using three alternative outgroups (72 strategies total). The data consist of 120 individuals sampled in eight ingroup species from subsection Strobus and three outgroup species from subsection Gerardianae, spanning ∼47 kilobases of sequence at 121 loci. Each "strategy" for inferring species trees consists of three features: a species tree construction method, a gene tree inference method, and a choice of outgroup. We use multivariate analysis techniques such as principal components analysis and hierarchical clustering to identify tree characteristics that are robustly observed across strategies, as well as to identify groups of strategies that produce trees with similar features. We find that strategies that construct species trees using only topological information cluster together and that strategies that use additional non-topological information (e.g., branch lengths) also cluster together. Strategies that utilize more than one individual within a species to infer gene trees tend to produce estimates of species trees that contain clades present in trees estimated by other strategies. Strategies that use the minimize-deep-coalescences criterion to construct species trees tend to produce species tree estimates that contain clades that are not present in trees estimated by the Concatenation, RTC, SMRT, STAR, and STEAC methods, and that in general are more balanced than those inferred by these other strategies.
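The bookkeeping behind such a comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the gene tree method and outgroup labels are hypothetical placeholders, the split of the 24 strategies into 6 species tree methods times 4 gene tree methods is inferred from the six method names in the abstract, the toy trees are invented, and Jaccard distance and the Colless index are stand-ins chosen here for "shared clades" and "balance".

```python
from itertools import product

# Species tree construction methods named in the abstract (MDC = minimize deep coalescences).
SPECIES_TREE_METHODS = ["Concatenation", "MDC", "RTC", "SMRT", "STAR", "STEAC"]
# Hypothetical placeholders: the abstract does not name the gene tree methods or outgroups.
GENE_TREE_METHODS = ["gene_method_1", "gene_method_2", "gene_method_3", "gene_method_4"]
OUTGROUPS = ["outgroup_1", "outgroup_2", "outgroup_3"]


def enumerate_strategies():
    """Every (species tree method, gene tree method, outgroup) combination."""
    return list(product(SPECIES_TREE_METHODS, GENE_TREE_METHODS, OUTGROUPS))


def clades(tree):
    """Non-trivial clades of a rooted tree given as nested tuples of leaf names."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    root = leaves(tree)
    found.discard(root)  # the clade containing every leaf carries no information
    return found


def colless(tree):
    """Colless imbalance of a binary tree: sum of |left leaves - right leaves| over internal nodes."""
    total = 0

    def count(node):
        nonlocal total
        if isinstance(node, str):
            return 1
        left, right = (count(child) for child in node)
        total += abs(left - right)
        return left + right

    count(tree)
    return total


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two clade sets; 0.0 when both are empty."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0


# Invented four-taxon trees standing in for two species tree estimates.
caterpillar = ((("A", "B"), "C"), "D")
balanced = (("A", "B"), ("C", "D"))

print(len(enumerate_strategies()))                                    # 72
print(colless(caterpillar), colless(balanced))                        # 3 0
print(round(jaccard_distance(clades(caterpillar), clades(balanced)), 3))  # 0.667
```

A matrix of such pairwise distances, one row per strategy, is the kind of input that hierarchical clustering or principal components analysis could then summarize.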

CONCLUSIONS

When constructing a species tree from a multilocus set of sequences, our observations provide a basis for interpreting differences in species tree estimates obtained via different approaches that have a two-stage structure in common, one step for gene tree estimation and a second step for species tree estimation. The methods explored here employ a number of distinct features of the data, and our analysis suggests that recovery of the same results from multiple methods that tend to differ in their patterns of inference can be a valuable tool for obtaining reliable estimates.
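One simple way to act on this suggestion is to tabulate how many strategies recover each clade and keep the clades that all (or most) of them agree on. The sketch below assumes each strategy's species tree has already been reduced to a set of clades; the strategy names, clade sets, and the `min_fraction` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def clade_support(clade_sets):
    """Fraction of strategies whose species tree contains each clade."""
    counts = Counter(clade for clade_set in clade_sets.values() for clade in clade_set)
    n = len(clade_sets)
    return {clade: count / n for clade, count in counts.items()}


def shared_clades(clade_sets, min_fraction=1.0):
    """Clades recovered by at least `min_fraction` of the strategies."""
    support = clade_support(clade_sets)
    return {clade for clade, fraction in support.items() if fraction >= min_fraction}


# Invented clade sets for three hypothetical strategies on taxa A-D.
AB, ABC, CD = frozenset("AB"), frozenset("ABC"), frozenset("CD")
estimates = {
    "strategy_1": {AB, ABC},
    "strategy_2": {AB, CD},
    "strategy_3": {AB, ABC},
}

unanimous = shared_clades(estimates)        # only the clade {A, B}
majority = shared_clades(estimates, 0.6)    # {A, B} and {A, B, C}
```

Clades that survive a high threshold across strategies with different inference patterns are the ones this line of reasoning would treat as better supported.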
