Pitfalls of heterogeneous processes for phylogenetic reconstruction.

Authors

Stefankovic Daniel, Vigoda Eric

Affiliation

Department of Computer Science, University of Rochester, Rochester, New York 14627, USA.

Publication information

Syst Biol. 2007 Feb;56(1):113-24. doi: 10.1080/10635150701245388.

DOI: 10.1080/10635150701245388
PMID: 17366141
Abstract

Different genes often have different phylogenetic histories. Even within regions having the same phylogenetic history, the mutation rates often vary. We investigate the prospects of phylogenetic reconstruction when all the characters are generated from the same tree topology, but the branch lengths vary (with possibly different tree shapes). Furthering work of Kolaczkowski and Thornton (2004, Nature 431: 980-984) and Chang (1996, Math. Biosci. 134: 189-216), we show examples where maximum likelihood (under a homogeneous model) is an inconsistent estimator of the tree. We then explore the prospects of phylogenetic inference under a heterogeneous model. In some models, there are examples where phylogenetic inference under any method is impossible - despite the fact that there is a common tree topology. In particular, there are nonidentifiable mixture distributions, i.e., multiple topologies generate identical mixture distributions. We address which evolutionary models have nonidentifiable mixture distributions and prove that the following duality theorem holds for most DNA substitution models. The model has either: (i) nonidentifiability - two different tree topologies can produce identical mixture distributions, and hence distinguishing between the two topologies is impossible; or (ii) linear tests - there exist linear tests which identify the common tree topology for character data generated by a mixture distribution. The theorem holds for models whose transition matrices can be parameterized by open sets, which includes most of the popular models, such as Tamura-Nei and Kimura's 2-parameter model. The duality theorem relies on our notion of linear tests, which are related to Lake's linear invariants.
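
The notion of a mixture distribution on a common topology can be illustrated with a small sketch. It is not the authors' construction: it assumes the two-state symmetric (CFN) model on a four-leaf tree, and the function names (`pattern_distribution`, `mix`, `linear_functional`) and the branch parameters are hypothetical choices made for illustration. It computes site-pattern probabilities for two branch-length settings on the same topology, mixes them, and checks that any linear functional of the mixture equals the weighted average of its values on the components, which is the basic reason linear tests and linear invariants remain meaningful for mixtures.

```python
from itertools import product


def pattern_distribution(split, pendant, internal):
    """Site-pattern probabilities on a 4-leaf tree under the two-state
    symmetric (CFN) model (an illustrative assumption).

    split    -- ((a, b), (c, d)): leaves a, b hang off one internal node,
                leaves c, d off the other
    pendant  -- flip probability on each leaf edge, indexed by leaf
    internal -- flip probability on the internal edge
    """
    (a, b), (c, d) = split
    dist = {}
    for pattern in product((0, 1), repeat=4):
        total = 0.0
        # Sum over the unobserved states u, v of the two internal nodes.
        for u, v in product((0, 1), repeat=2):
            p = 0.5  # uniform distribution on the state of u
            p *= internal if u != v else 1.0 - internal
            for leaf, parent in ((a, u), (b, u), (c, v), (d, v)):
                e = pendant[leaf]
                p *= e if pattern[leaf] != parent else 1.0 - e
            total += p
        dist[pattern] = total
    return dist


def mix(weights, dists):
    """Convex combination of site-pattern distributions."""
    return {k: sum(w * d[k] for w, d in zip(weights, dists)) for k in dists[0]}


def linear_functional(coeffs, dist):
    """Evaluate sum_k coeffs[k] * dist[k]."""
    return sum(coeffs[k] * dist[k] for k in dist)


if __name__ == "__main__":
    topology = ((0, 1), (2, 3))  # the common tree topology 01|23
    # Two hypothetical branch-length settings on the same topology.
    d1 = pattern_distribution(topology, [0.1, 0.4, 0.1, 0.4], 0.05)
    d2 = pattern_distribution(topology, [0.4, 0.1, 0.4, 0.1], 0.05)
    mixture = mix([0.5, 0.5], [d1, d2])

    coeffs = {k: float(sum(k)) for k in mixture}  # arbitrary linear test
    lhs = linear_functional(coeffs, mixture)
    rhs = 0.5 * linear_functional(coeffs, d1) + 0.5 * linear_functional(coeffs, d2)
    print(abs(lhs - rhs) < 1e-12)
```

Because the mixture is a convex combination, a linear functional that vanishes on every single-tree distribution of a given topology also vanishes on every mixture of such distributions. The sketch does not decide whether a particular model is identifiable; it only shows the objects the abstract refers to.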

Similar articles

1. Pitfalls of heterogeneous processes for phylogenetic reconstruction.
Syst Biol. 2007 Feb;56(1):113-24. doi: 10.1080/10635150701245388.
2. Phylogeny of mixture models: robustness of maximum likelihood and non-identifiable distributions.
J Comput Biol. 2007 Mar;14(2):156-89. doi: 10.1089/cmb.2006.0126.
3. Phylogenetic mixtures on a single tree can mimic a tree of another topology.
Syst Biol. 2007 Oct;56(5):767-75. doi: 10.1080/10635150701627304.
4. Maximum likelihood estimates of species trees: how accuracy of phylogenetic inference depends upon the divergence history and sampling design.
Syst Biol. 2009 Oct;58(5):501-8. doi: 10.1093/sysbio/syp045. Epub 2009 Aug 20.
5. Inconsistency of phylogenetic estimates from concatenated data under coalescence.
Syst Biol. 2007 Feb;56(1):17-24. doi: 10.1080/10635150601146041.
6. Artifactual phylogenies caused by correlated distribution of substitution rates among sites and lineages: the good, the bad, and the ugly.
Syst Biol. 2007 Feb;56(1):68-82. doi: 10.1080/10635150601175578.
7. Phylogenetic mixture models can reduce node-density artifacts.
Syst Biol. 2008 Apr;57(2):286-93. doi: 10.1080/10635150802044045.
8. Fundamental differences between the methods of maximum likelihood and maximum posterior probability in phylogenetics.
Syst Biol. 2006 Feb;55(1):116-21. doi: 10.1080/10635150500481648.
9. SPIn: model selection for phylogenetic mixtures via linear invariants.
Mol Biol Evol. 2012 Mar;29(3):929-37. doi: 10.1093/molbev/msr259. Epub 2011 Oct 17.
10. What is the danger of the anomaly zone for empirical phylogenetics?
Syst Biol. 2009 Oct;58(5):527-36. doi: 10.1093/sysbio/syp047. Epub 2009 Aug 26.

Cited by

1. Environmental dependence of genetic constraint.
PLoS Genet. 2013 Jun;9(6):e1003580. doi: 10.1371/journal.pgen.1003580. Epub 2013 Jun 27.
2. Identifiability and inference of non-parametric rates-across-sites models on large-scale phylogenies.
J Math Biol. 2013 Oct;67(4):767-97. doi: 10.1007/s00285-012-0571-4. Epub 2012 Aug 9.
3. An optimization-based sampling scheme for phylogenetic trees.
J Comput Biol. 2011 Nov;18(11):1599-609. doi: 10.1089/cmb.2011.0164. Epub 2011 Sep 27.
4. A late origin of the extant eukaryotic diversity: divergence time estimates using rare genomic changes.
Biol Direct. 2011 May 19;6:26. doi: 10.1186/1745-6150-6-26.
5. Biomarkers in the age of omics: time for a systems biology approach.
OMICS. 2011 Mar;15(3):105-12. doi: 10.1089/omi.2010.0023. Epub 2011 Feb 14.
6. Analysis of rare genomic changes does not support the unikont-bikont phylogeny and suggests cyanobacterial symbiosis as the point of primary radiation of eukaryotes.
Genome Biol Evol. 2009 May 25;1:99-113. doi: 10.1093/gbe/evp011.
7. Evolution of a new function by degenerative mutation in cephalochordate steroid receptors.
PLoS Genet. 2008 Sep 12;4(9):e1000191. doi: 10.1371/journal.pgen.1000191.
8. Evolutionary medicine: A meaningful connection between omics, disease, and treatment.
Proteomics Clin Appl. 2008 Feb;2(2):122-134. doi: 10.1002/prca.200780047.
9. A mixed branch length model of heterotachy improves phylogenetic accuracy.
Mol Biol Evol. 2008 Jun;25(6):1054-66. doi: 10.1093/molbev/msn042. Epub 2008 Mar 3.