Assessing parameter identifiability in phylogenetic models using data cloning.

Affiliation

Department of Biology, University of Florida, Gainesville, FL 32611, USA.

Publication information

Syst Biol. 2012 Dec 1;61(6):955-72. doi: 10.1093/sysbio/sys055. Epub 2012 May 30.

DOI:10.1093/sysbio/sys055
PMID:22649181
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3478565/
Abstract

The success of model-based methods in phylogenetics has motivated much research aimed at generating new, biologically informative models. This new computer-intensive approach to phylogenetics demands validation studies and sound measures of performance. To date there has been little practical guidance available as to when and why the parameters in a particular model can be identified reliably. Here, we illustrate how Data Cloning (DC), a recently developed methodology to compute the maximum likelihood estimates along with their asymptotic variance, can be used to diagnose structural parameter nonidentifiability (NI) and distinguish it from other parameter estimability problems, including when parameters are structurally identifiable, but are not estimable in a given data set (INE), and when parameters are identifiable, and estimable, but only weakly so (WE). The application of the DC theorem uses well-known and widely used Bayesian computational techniques. With the DC approach, practitioners can use Bayesian phylogenetics software to diagnose nonidentifiability. Theoreticians and practitioners alike now have a powerful, yet simple tool to detect nonidentifiability while investigating complex modeling scenarios, where getting closed-form expressions in a probabilistic study is complicated. Furthermore, here we also show how DC can be used as a tool to examine and eliminate the influence of the priors, in particular if the process of prior elicitation is not straightforward. Finally, when applied to phylogenetic inference, DC can be used to study at least two important statistical questions: assessing identifiability of discrete parameters, like the tree topology, and developing efficient sampling methods for computationally expensive posterior densities.
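The diagnostic described in the abstract can be illustrated on a toy problem. The property usually cited for data cloning is that, when the data are copied K times, the posterior variance of an identifiable parameter shrinks roughly like 1/K, while for a structurally nonidentifiable parameter it does not shrink toward zero. The sketch below is not the authors' phylogenetic analysis and does not run MCMC: it assumes a hypothetical Gaussian model with Gaussian priors, for which the cloned posterior covariance is available in closed form, and compares the model `mean = mu` with the model `mean = a + b`, in which only the sum is informed by the data. The function names, sample size, and prior scale are illustrative assumptions.

```python
import numpy as np


def cloned_posterior_cov(design, n_obs, n_clones, prior_sd=10.0, noise_sd=1.0):
    """Exact posterior covariance of theta under data cloning (toy Gaussian model).

    Assumed model: y_i ~ N(design @ theta, noise_sd^2), i = 1..n_obs, with
    independent N(0, prior_sd^2) priors on each component of theta.
    Cloning the data n_clones times multiplies the data precision by n_clones.
    """
    design = np.atleast_2d(np.asarray(design, dtype=float))  # shape (1, p)
    p = design.shape[1]
    prior_precision = np.eye(p) / prior_sd**2
    data_precision = n_clones * n_obs * (design.T @ design) / noise_sd**2
    return np.linalg.inv(prior_precision + data_precision)


def largest_eigenvalue(cov):
    """Largest eigenvalue of a symmetric covariance matrix."""
    return float(np.linalg.eigvalsh(cov)[-1])  # eigvalsh sorts ascending


clone_counts = [1, 2, 4, 8, 16, 32]
models = {
    "identifiable (mean = mu)": [1.0],
    "nonidentifiable (mean = a + b)": [1.0, 1.0],
}

for name, design in models.items():
    base = largest_eigenvalue(cloned_posterior_cov(design, n_obs=20, n_clones=1))
    # Ratio of the largest posterior-variance eigenvalue at K clones to K = 1.
    ratios = [
        largest_eigenvalue(cloned_posterior_cov(design, n_obs=20, n_clones=k)) / base
        for k in clone_counts
    ]
    print(name, np.round(ratios, 3))
```

In this toy setting the ratio for the identifiable model falls close to 1/K, whereas for the nonidentifiable model it stays at 1 because the direction `a - b` is informed only by the prior. In a real analysis the covariance would be estimated from MCMC samples drawn with the cloned likelihood rather than computed analytically.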

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/33bc/3478565/1f769bbce7b3/sys055f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/33bc/3478565/b25b44a9b9ef/sys055f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/33bc/3478565/93e6d69e2399/sys055f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/33bc/3478565/31d81082d62f/sys055f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/33bc/3478565/dad400a096cf/sys055f5.jpg
