Accelerated Estimation of Frequency Classes in Site-Heterogeneous Profile Mixture Models.

Affiliations

Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada.

École Nationale Supérieure de Techniques Avancées, Palaiseau, France.

Publication information

Mol Biol Evol. 2018 May 1;35(5):1266-1283. doi: 10.1093/molbev/msy026.

DOI: 10.1093/molbev/msy026
PMID: 29688541

Abstract

As a consequence of structural and functional constraints, proteins tend to have site-specific preferences for particular amino acids. Failing to adjust for heterogeneity of frequencies over sites can lead to artifacts in phylogenetic estimation. Site-heterogeneous mixture-models have been developed to address this problem. However, due to prohibitive computational times, maximum likelihood implementations utilize fixed component frequency vectors inferred from sequences in a database that are external to the alignment under analysis. Here, we propose a composite likelihood approach to estimation of component frequencies for a mixture model that directly uses the data from the alignment of interest. In the common case that the number of taxa under study is not large, several adjustments to the default composite likelihood are shown to be necessary. In simulations, the approach is shown to provide large improvements over hierarchical clustering. For empirical data, substantial improvements in likelihoods are found over mixtures using fixed components.
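
To make the idea of estimating component frequencies from the alignment itself more concrete, here is a minimal sketch. It is not the composite likelihood estimator proposed in the paper: as a simplifying assumption it ignores the tree entirely, treats the residues in each alignment column as independent draws from one of K frequency profiles, and fits the class weights and profiles by EM. The function names (`site_counts`, `fit_profile_mixture`), the pseudocount `pseudo`, and the random initialization are all hypothetical choices made for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_counts(alignment):
    """Count amino acids in each column of an alignment (equal-length strings)."""
    n_sites = len(alignment[0])
    counts = []
    for j in range(n_sites):
        col = [0] * len(AMINO_ACIDS)
        for seq in alignment:
            idx = AMINO_ACIDS.find(seq[j])
            if idx >= 0:  # skip gaps and ambiguous characters
                col[idx] += 1
        counts.append(col)
    return counts


def fit_profile_mixture(counts, n_classes, n_iter=100, pseudo=0.01, seed=0):
    """EM for a mixture of multinomial profiles over alignment columns.

    Assumption (not from the paper): residues within a column are treated
    as independent given the column's class, so no tree is involved.
    """
    rng = random.Random(seed)
    n_states = len(counts[0])

    # Random starting profiles, each normalized to sum to 1.
    profiles = []
    for _ in range(n_classes):
        raw = [rng.random() + 0.1 for _ in range(n_states)]
        total = sum(raw)
        profiles.append([x / total for x in raw])
    weights = [1.0 / n_classes] * n_classes

    for _ in range(n_iter):
        # E-step: posterior probability of each class for each column.
        resp = []
        for c in counts:
            logp = [
                math.log(weights[k])
                + sum(n * math.log(profiles[k][a]) for a, n in enumerate(c) if n)
                for k in range(n_classes)
            ]
            m = max(logp)  # subtract the max for numerical stability
            p = [math.exp(v - m) for v in logp]
            s = sum(p)
            resp.append([x / s for x in p])

        # M-step: update weights and profiles; the pseudocount keeps
        # every probability strictly positive.
        for k in range(n_classes):
            rk = sum(r[k] for r in resp)
            weights[k] = (rk + pseudo) / (len(counts) + n_classes * pseudo)
            raw = [
                pseudo + sum(r[k] * c[a] for r, c in zip(resp, counts))
                for a in range(n_states)
            ]
            total = sum(raw)
            profiles[k] = [x / total for x in raw]

    return weights, profiles


if __name__ == "__main__":
    toy = ["AAGGWWYY", "AGAGWYWY", "GAGAYWYW", "AAGAWWYW"]
    w, prof = fit_profile_mixture(site_counts(toy), n_classes=2)
    print("class weights:", [round(x, 3) for x in w])
```

With few taxa each column carries very little information about its class, which is in the spirit of the abstract's remark that adjustments become necessary when the number of taxa is not large; the pseudocount here is only a crude stand-in for such adjustments.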
