
Starless bias and parameter-estimation bias in the likelihood-based phylogenetic method.

Author

Xia Xuhua

Affiliations

Department of Biology, University of Ottawa, Ottawa, Canada, K1N 6N5.

Ottawa Institute of Systems Biology, Ottawa, Canada, K1H 8M5.

Publication information

AIMS Genet. 2019 Apr 9;5(4):212-223. doi: 10.3934/genet.2018.4.212. eCollection 2018.

DOI:10.3934/genet.2018.4.212
PMID:31435522
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6690233/
Abstract

I analyzed various site pattern combinations in a 4-OTU case to identify sources of starless bias and parameter-estimation bias in likelihood-based phylogenetic methods, and reported three significant contributions. First, the likelihood method is counterintuitive in that it may not generate a star tree with sequences that are equidistant from each other. This behaviour, dubbed starless bias, happens in a 4-OTU tree when there is an excess (i.e., more than expected from a star tree and a substitution model) of conflicting phylogenetic signals supporting the three resolved topologies equally. Special site pattern combinations leading to rejection of a star tree, when sequences are equidistant from each other, were identified. Second, fitting gamma distribution to model rate heterogeneity over sites is strongly confounded with tree topology, especially in conjunction with the starless bias. I present examples to show dramatic differences in the estimated shape parameter α between a star tree and a resolved tree. There may be no rate heterogeneity over sites (with the estimated α > 10000) when a star tree is imposed, but α < 1 (suggesting strong rate heterogeneity over sites) when an (incorrect) resolved tree is imposed. Thus, the dependence of "rate heterogeneity" on tree topology implies that "rate heterogeneity" is not a sequence-specific feature, cautioning against interpreting a small α to mean that some sites are under strong purifying selection and others not. Thirdly, because there is no existing (and working) likelihood method for evaluating a star tree with continuous gamma-distributed rate, I have implemented the method for JC69 in a self-contained R script for a four-OTU tree (star or resolved), in addition to another R script assuming a constant rate over sites. These R scripts should be useful for teaching and exploring likelihood methods in phylogenetics.
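
The likelihood calculation the abstract refers to can be sketched in a few lines. The Python below is a minimal illustration, assuming JC69 with a constant rate over sites; it is not the author's R script, and the function names, branch lengths, and site-pattern counts are hypothetical. It evaluates a 4-OTU site pattern on a star tree and on the resolved tree ((1,2),(3,4)); setting the internal branch to zero collapses the resolved tree to the star tree.

```python
import math
from itertools import product

STATES = "ACGT"

def jc69_prob(a, b, t):
    """JC69 probability of observing state b after time t, starting from a.
    t is the branch length in expected substitutions per site."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if a == b else 0.25 - 0.25 * e

def site_lik_star(pattern, branches):
    """Likelihood of one site pattern on a 4-OTU star tree.
    Sums over the unknown state at the central node (prior 1/4 each)."""
    total = 0.0
    for root in STATES:
        term = 0.25
        for x, b in zip(pattern, branches):
            term *= jc69_prob(root, x, b)
        total += term
    return total

def site_lik_resolved(pattern, branches, internal):
    """Likelihood of one site pattern on the resolved tree ((1,2),(3,4)).
    Sums over the states u, v at the two internal nodes."""
    x1, x2, x3, x4 = pattern
    b1, b2, b3, b4 = branches
    total = 0.0
    for u, v in product(STATES, repeat=2):
        total += (0.25
                  * jc69_prob(u, x1, b1) * jc69_prob(u, x2, b2)
                  * jc69_prob(u, v, internal)
                  * jc69_prob(v, x3, b3) * jc69_prob(v, x4, b4))
    return total

def log_lik(pattern_counts, site_lik):
    """Log-likelihood of a dict {pattern: count} given a per-site function."""
    return sum(n * math.log(site_lik(p)) for p, n in pattern_counts.items())

if __name__ == "__main__":
    # Hypothetical counts: equal support for each of the three resolved topologies.
    counts = {"AAAA": 50, "AACC": 5, "ACAC": 5, "ACCA": 5}
    branches = (0.1, 0.1, 0.1, 0.1)  # hypothetical, not optimized
    print(log_lik(counts, lambda p: site_lik_star(p, branches)))
    print(log_lik(counts, lambda p: site_lik_resolved(p, branches, 0.05)))
```

The branch lengths here are fixed rather than optimized; a maximum-likelihood comparison of the two topologies would maximize `log_lik` over the branch lengths of each tree before comparing them.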


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4acc/6690233/4f01c307c98a/genetics-05-04-212-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4acc/6690233/15dee3fce51d/genetics-05-04-212-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4acc/6690233/62c1e3751216/genetics-05-04-212-g002.jpg

Similar articles

1
Starless bias and parameter-estimation bias in the likelihood-based phylogenetic method.
AIMS Genet. 2019 Apr 9;5(4):212-223. doi: 10.3934/genet.2018.4.212. eCollection 2018.
2
When being "most likely" is not enough: examining the performance of three uses of the parametric bootstrap in phylogenetics.
J Mol Evol. 2003 Feb;56(2):198-222. doi: 10.1007/s00239-002-2394-1.
3
Large sample approximations of probabilities of correct evolutionary tree estimation and biases of maximum likelihood estimation.
Stat Appl Genet Mol Biol. 2011;10:Article 10. doi: 10.2202/1544-6115.1626.
4
Cases in which ancestral maximum likelihood will be confusingly misleading.
J Theor Biol. 2017 May 7;420:318-323. doi: 10.1016/j.jtbi.2017.03.001. Epub 2017 Mar 2.
5
Fair-balance paradox, star-tree paradox, and Bayesian phylogenetics.
Mol Biol Evol. 2007 Aug;24(8):1639-55. doi: 10.1093/molbev/msm081. Epub 2007 May 7.
6
Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous.
Nature. 2004 Oct 21;431(7011):980-4. doi: 10.1038/nature02917.
7
Quartet-mapping, a generalization of the likelihood-mapping procedure.
Mol Biol Evol. 2001 Jul;18(7):1204-19. doi: 10.1093/oxfordjournals.molbev.a003907.
8
Phylogenetic analysis using parsimony and likelihood methods.
J Mol Evol. 1996 Feb;42(2):294-307. doi: 10.1007/BF02198856.
9
Substitution model of sequence evolution for the human immunodeficiency virus type 1 subtype B gp120 gene over the C2-V5 region.
J Mol Evol. 2001 Jul;53(1):55-62. doi: 10.1007/s002390010192.
10
Short Tree, Long Tree, Right Tree, Wrong Tree: New Acquisition Bias Corrections for Inferring SNP Phylogenies.
Syst Biol. 2015 Nov;64(6):1032-47. doi: 10.1093/sysbio/syv053. Epub 2015 Jul 29.

Cited by

1
Major Revisions in Arthropod Phylogeny Through Improved Supermatrix, With Support for Two Possible Waves of Land Invasion by Chelicerates.
Evol Bioinform Online. 2020 Feb 5;16:1176934320903735. doi: 10.1177/1176934320903735. eCollection 2020.
