Bayesian selection of nucleotide substitution models and their site assignments.

Affiliation

Department of Computer Science, University of Auckland, Auckland, New Zealand.

Publication information

Mol Biol Evol. 2013 Mar;30(3):669-88. doi: 10.1093/molbev/mss258. Epub 2012 Dec 11.

DOI:10.1093/molbev/mss258

PMID:23233462

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3563969/

Abstract

Probabilistic inference of a phylogenetic tree from molecular sequence data is predicated on a substitution model describing the relative rates of change between character states along the tree for each site in the multiple sequence alignment. Commonly, one assumes that the substitution model is homogeneous across sites within large partitions of the alignment, assigns these partitions a priori, and then fixes their underlying substitution model to the best-fitting model from a hierarchy of named models. Here, we introduce an automatic model selection and model averaging approach within a Bayesian framework that simultaneously estimates the number of partitions, the assignment of sites to partitions, the substitution model for each partition, and the uncertainty in these selections. This new approach is implemented as an add-on to the BEAST 2 software platform. We find that this approach dramatically improves the fit of the nucleotide substitution model compared with existing approaches, and we show, using a number of example data sets, that as many as nine partitions are required to explain the heterogeneity in nucleotide substitution process across sites in a single gene analysis. In some instances, this improved modeling of the substitution process can have a measurable effect on downstream inference, including the estimated phylogeny, relative divergence times, and effective population size histories.
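The abstract says the method estimates the number of partitions, the assignment of sites to partitions, and the uncertainty in both. As a rough illustration of how that uncertainty might be summarised from posterior samples, here is a minimal Python sketch. It is not the BEAST 2 add-on's interface: the function names, the representation of a sample as a tuple of partition labels, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def partition_count_posterior(samples):
    """Estimate the posterior probability of each number of partitions.

    `samples` is a list of site-assignment vectors (hypothetical format):
    element s of a vector is the partition label of alignment site s.
    """
    counts = Counter(len(set(z)) for z in samples)
    n = len(samples)
    return {k: c / n for k, c in sorted(counts.items())}


def coassignment_probability(samples, i, j):
    """Fraction of samples in which sites i and j share a partition."""
    return sum(z[i] == z[j] for z in samples) / len(samples)


# Toy posterior samples for a 5-site alignment (made-up values, not real output).
samples = [
    (0, 0, 1, 1, 1),
    (0, 0, 1, 1, 2),
    (0, 0, 0, 1, 1),
    (0, 1, 1, 1, 1),
]

print(partition_count_posterior(samples))
print(coassignment_probability(samples, 0, 1))
```

Summaries of this kind are one way to report model-averaged results: rather than fixing a single partitioning scheme, the analysis reports how often each number of partitions, or each pairing of sites, appears across the posterior sample.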

Similar articles

Bayesian coestimation of phylogeny and sequence alignment.

BMC Bioinformatics. 2005 Apr 1;6:83. doi: 10.1186/1471-2105-6-83.

Markov-Modulated Continuous-Time Markov Chains to Identify Site- and Branch-Specific Evolutionary Variation in BEAST.

Syst Biol. 2021 Jan 1;70(1):181-189. doi: 10.1093/sysbio/syaa037.

bModelTest: Bayesian phylogenetic site model averaging and model comparison.

BMC Evol Biol. 2017 Feb 6;17(1):42. doi: 10.1186/s12862-017-0890-6.

Bayesian random local clocks, or one rate to rule them all.

BMC Biol. 2010 Aug 31;8:114. doi: 10.1186/1741-7007-8-114.

A reversible jump method for Bayesian phylogenetic inference with a nonhomogeneous substitution model.

Mol Biol Evol. 2007 Jun;24(6):1286-99. doi: 10.1093/molbev/msm046. Epub 2007 Mar 7.

Site-specific evolutionary rate inference: taking phylogenetic uncertainty into account.

J Mol Evol. 2005 Mar;60(3):345-53. doi: 10.1007/s00239-004-0183-8.

Bayesian phylogenetic analysis of combined data.

Syst Biol. 2004 Feb;53(1):47-67. doi: 10.1080/10635150490264699.

The devil in the details: interactions between the branch-length prior and likelihood model affect node support and branch lengths in the phylogeny of the Psoraceae.

Syst Biol. 2011 Jul;60(4):541-61. doi: 10.1093/sysbio/syr022. Epub 2011 Mar 24.

Model averaging and Bayes factor calculation of relaxed molecular clocks in Bayesian phylogenetics.

Mol Biol Evol. 2012 Feb;29(2):751-61. doi: 10.1093/molbev/msr232. Epub 2011 Sep 22.

Cited by

Ancient trans-species polymorphism at the Major Histocompatibility Complex in primates.

Elife. 2025 Sep 12;14:RP103547. doi: 10.7554/eLife.103547.

Infinite Mixture Models for Improved Modeling of Across-Site Evolutionary Variation.

Mol Biol Evol. 2025 Jul 30;42(8). doi: 10.1093/molbev/msaf199.

The Primate Major Histocompatibility Complex: An Illustrative Example of Gene Family Evolution.

bioRxiv. 2024 Sep 18:2024.09.16.613318. doi: 10.1101/2024.09.16.613318.

Toward a Semi-Supervised Learning Approach to Phylogenetic Estimation.

Syst Biol. 2024 Oct 30;73(5):789-806. doi: 10.1093/sysbio/syae029.

OBAMA: OBAMA for Bayesian amino-acid model averaging.

PeerJ. 2020 Aug 4;8:e9460. doi: 10.7717/peerj.9460. eCollection 2020.

Current Affairs of Microbial Genome-Wide Association Studies: Approaches, Bottlenecks and Analytical Pitfalls.

Front Microbiol. 2020 Jan 30;10:3119. doi: 10.3389/fmicb.2019.03119. eCollection 2019.

BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis.

PLoS Comput Biol. 2019 Apr 8;15(4):e1006650. doi: 10.1371/journal.pcbi.1006650. eCollection 2019 Apr.

Information Criteria for Comparing Partition Schemes.

Syst Biol. 2018 Jul 1;67(4):616-632. doi: 10.1093/sysbio/syx097.

Evolutionary dynamics of language systems.

Proc Natl Acad Sci U S A. 2017 Oct 17;114(42):E8822-E8829. doi: 10.1073/pnas.1700388114. Epub 2017 Oct 4.

Insights into intercontinental spread of Zika virus.

PLoS One. 2017 Apr 27;12(4):e0176710. doi: 10.1371/journal.pone.0176710. eCollection 2017.

References

MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space.

Syst Biol. 2012 May;61(3):539-42. doi: 10.1093/sysbio/sys029. Epub 2012 Feb 22.

Partitionfinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses.

Mol Biol Evol. 2012 Jun;29(6):1695-701. doi: 10.1093/molbev/mss020. Epub 2012 Jan 20.

Purifying selection can obscure the ancient age of viral lineages.

Mol Biol Evol. 2011 Dec;28(12):3355-65. doi: 10.1093/molbev/msr170. Epub 2011 Jun 24.

The mode and tempo of hepatitis C virus evolution within and among hosts.

BMC Evol Biol. 2011 May 19;11:131. doi: 10.1186/1471-2148-11-131.

Joint inference of microsatellite mutation models, population history and genealogies using transdimensional Markov Chain Monte Carlo.

Genetics. 2011 May;188(1):151-64. doi: 10.1534/genetics.110.125260. Epub 2011 Mar 8.

Among-site rate variation and its impact on phylogenetic analyses.

Trends Ecol Evol. 1996 Sep;11(9):367-72. doi: 10.1016/0169-5347(96)10041-0.

New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0.

Syst Biol. 2010 May;59(3):307-21. doi: 10.1093/sysbio/syq010. Epub 2010 Mar 29.

Bayesian inference of species trees from multilocus data.

Mol Biol Evol. 2010 Mar;27(3):570-80. doi: 10.1093/molbev/msp274. Epub 2009 Nov 11.

Bayesian phylogeography finds its roots.

PLoS Comput Biol. 2009 Sep;5(9):e1000520. doi: 10.1371/journal.pcbi.1000520. Epub 2009 Sep 25.

Bayesian analysis of amino acid substitution models.

Philos Trans R Soc Lond B Biol Sci. 2008 Dec 27;363(1512):3941-53. doi: 10.1098/rstb.2008.0175.
