
Bayesian Cross-Validation Comparison of Amino Acid Replacement Models: Contrasting Profile Mixtures, Pairwise Exchangeabilities, and Gamma-Distributed Rates-Across-Sites.

Affiliations

Department of Biology, Institute of Biochemistry, Carleton University, Ottawa, Canada.

School of Mathematics and Statistics, Carleton University, 209 Nesbitt Biology Building, 1125 Colonel By Drive, Ottawa, ON, K1A 0C6, Canada.

Publication information

J Mol Evol. 2022 Dec;90(6):468-475. doi: 10.1007/s00239-022-10076-y. Epub 2022 Oct 7.

DOI:10.1007/s00239-022-10076-y
PMID:36207534
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9643205/
Abstract

Models of amino acid replacement are central to modern phylogenetic inference, particularly for deep evolutionary relationships. Traditionally, a single, empirically derived matrix was used, to keep the degrees of freedom of the inference low and the focus on topology. With the growing size of data sets, however, an amino acid-level general-time-reversible matrix has become increasingly feasible, treating amino acid exchangeabilities and frequencies as free parameters. Moreover, models based on mixtures of multiple matrices are increasingly used to account for across-site heterogeneities in the amino acid requirements of proteins. Such models exist as finite empirically derived amino acid profile (or frequency) mixtures, free finite mixtures, and free Dirichlet process-based infinite mixtures. All of these approaches are typically combined with a gamma-distributed rates-across-sites model. Despite the availability of these different aspects of modeling the amino acid replacement process, no study has systematically quantified their relative contributions to predictive power on real data. Here, we use Bayesian cross-validation to establish a detailed comparison, while activating/deactivating each modeling aspect. For most data sets studied, we find that amino acid mixture models can outrank all single-matrix models, even when the latter include gamma-distributed rates and the former do not. We also find that free finite mixtures consistently outperform empirical finite mixtures. Finally, the Dirichlet process-based mixture model tends to outperform all other approaches.
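
The abstract does not spell out how a Bayesian cross-validation score is computed. The sketch below shows one common form of it and is an assumption, not the authors' implementation. It assumes each model has been fitted to a training subset of sites and that, for each posterior draw, the log-likelihood of the held-out sites has been evaluated by some external tool. The function names and the toy numbers are hypothetical.

```python
import math

def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))

def cv_score(test_loglik_per_sample):
    """Cross-validation score for one training/test replicate.

    test_loglik_per_sample[k] is the log-likelihood of the held-out sites
    under the k-th posterior draw obtained from the training sites.
    The score is the log of the posterior-averaged test likelihood.
    """
    return log_mean_exp(test_loglik_per_sample)

def compare_models(scores_a, scores_b):
    """Mean and sample standard deviation of per-replicate differences (A minus B)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean = sum(diffs) / len(diffs)
    var = sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1)
    return mean, math.sqrt(var)

# Toy, made-up scores for two hypothetical models over two replicates.
mean_diff, sd_diff = compare_models([-100.0, -110.0], [-105.0, -112.0])
```

A positive mean difference favors the first model on held-out data. Averaging in log space through `log_mean_exp` avoids underflow, since test log-likelihoods for real alignments are large negative numbers.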

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a33b/9643205/73b2d5deb82f/239_2022_10076_Fig1_HTML.jpg
