Modeling protein evolution with several amino acid replacement matrices depending on site rates.

Affiliation

Méthodes et Algorithmes pour la Bioinformatique (LIRMM & IBC), Centre National de la Recherche Scientifique (CNRS)-Université Montpellier II, Montpellier Cedex 5, France.

Publication information

Mol Biol Evol. 2012 Oct;29(10):2921-36. doi: 10.1093/molbev/mss112. Epub 2012 Apr 6.

DOI: 10.1093/molbev/mss112
PMID: 22491036

Abstract

Most protein substitution models use a single amino acid replacement matrix summarizing the biochemical properties of amino acids. However, site evolution is highly heterogeneous and depends on many factors that influence the substitution patterns. In this paper, we investigate the use of different substitution matrices for different site evolutionary rates. Indeed, the variability of evolutionary rates corresponds to one of the most apparent heterogeneity factors among sites, and there is no reason to assume that the substitution patterns remain identical regardless of the evolutionary rate. We first introduce LG4M, which is composed of four matrices, each corresponding to one discrete gamma rate category (of four). These matrices differ in their amino acid equilibrium distributions and in their exchangeabilities, contrary to the standard gamma model where only the global rate differs from one category to another. Next, we present LG4X, which also uses four different matrices, but leaves aside the gamma distribution and follows a distribution-free scheme for the site rates. All these matrices are estimated from a very large alignment database, and our two models are tested using a large sample of independent alignments. Detailed analysis of resulting matrices and models shows the complexity of amino acid substitutions and the advantage of flexible models such as LG4M and LG4X. Both significantly outperform single-matrix models, providing gains of dozens to hundreds of log-likelihood units for most data sets. LG4X obtains substantial gains compared with LG4M, thanks to its distribution-free scheme for site rates. Since LG4M and LG4X display such advantages but require the same memory space and have comparable running times to standard models, we believe that LG4M and LG4X are relevant alternatives to single replacement matrices. Our models, data, and software are available from http://www.atgc-montpellier.fr/models/lg4x.
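The contrast the abstract draws between a single-matrix gamma model and a model with one matrix per rate class can be illustrated with a small sketch. This is not the authors' software and it does not use the LG4M or LG4X matrices: it assumes a toy two-state alphabet in place of the 20 amino acids, a pair of sequences in place of a tree, and made-up weights, rates, and equilibrium frequencies. The function names are hypothetical, and rescaling the rates to a weighted mean of 1 is an assumed convention. Each mixture class carries its own weight, rate, and equilibrium frequencies, and the site likelihood is the weighted sum over classes.

```python
import math


def two_state_prob(i, j, pi, t):
    """P(state j at time t | state i at time 0) for a reversible two-state model.

    pi is (pi_0, pi_1). The rate matrix is scaled so that one unit of t
    corresponds to one expected substitution per site.
    """
    mu = 1.0 / (2.0 * pi[0] * pi[1])  # scaling constant for the rate matrix
    decay = math.exp(-mu * t)
    same = 1.0 if i == j else 0.0
    return pi[j] + (same - pi[j]) * decay


def normalise_rates(classes):
    """Rescale class rates so the weighted mean rate is 1 (assumed convention)."""
    mean_rate = sum(w * r for w, r, _ in classes)
    return [(w, r / mean_rate, pi) for w, r, pi in classes]


def pair_site_likelihood(i, j, t, classes):
    """Mixture likelihood of seeing states (i, j) in two sequences at distance t.

    classes is a list of (weight, rate, pi) triples, one per mixture class.
    """
    total = 0.0
    for weight, rate, pi in classes:
        total += weight * pi[i] * two_state_prob(i, j, pi, rate * t)
    return total


# Single-matrix model: four equally weighted rate classes that share the same
# equilibrium frequencies; only the rate differs between classes.
# (Illustrative rates, not real gamma quantiles.)
single_matrix = normalise_rates(
    [(0.25, r, (0.5, 0.5)) for r in (0.1, 0.5, 1.0, 2.4)]
)

# Multi-matrix model with free weights and rates: each class has its own
# weight, rate, and equilibrium frequencies. (All values are made up.)
multi_matrix = normalise_rates([
    (0.2, 0.1, (0.8, 0.2)),
    (0.3, 0.5, (0.6, 0.4)),
    (0.3, 1.2, (0.5, 0.5)),
    (0.2, 3.0, (0.4, 0.6)),
])

# Log-likelihood of a short toy alignment column pattern under each model.
site_patterns = [(0, 0), (0, 1), (1, 1), (0, 0)]
distance = 0.3
loglik_single = sum(
    math.log(pair_site_likelihood(i, j, distance, single_matrix))
    for i, j in site_patterns
)
loglik_multi = sum(
    math.log(pair_site_likelihood(i, j, distance, multi_matrix))
    for i, j in site_patterns
)
```

In the sketch, the single-matrix list shares one frequency vector across classes, whereas the multi-matrix list lets slow and fast classes have different substitution patterns, which is the flexibility the abstract attributes to LG4M and LG4X. The number of classes is the same in both cases, so the per-site work is the same.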

Similar articles

1
Modeling protein evolution with several amino acid replacement matrices depending on site rates.
Mol Biol Evol. 2012 Oct;29(10):2921-36. doi: 10.1093/molbev/mss112. Epub 2012 Apr 6.
2
Empirical models for substitution in ribosomal RNA.
Mol Biol Evol. 2004 Mar;21(3):419-27. doi: 10.1093/molbev/msh029. Epub 2003 Dec 5.
3
Phylogenetic mixture models for proteins.
Philos Trans R Soc Lond B Biol Sci. 2008 Dec 27;363(1512):3965-76. doi: 10.1098/rstb.2008.0180.
4
QMix: An Efficient Program to Automatically Estimate Multi-Matrix Mixture Models for Amino Acid Substitution Process.
J Comput Biol. 2024 Aug;31(8):703-707. doi: 10.1089/cmb.2023.0403. Epub 2024 Jun 11.
5
A protein evolution model with independent sites that reproduces site-specific amino acid distributions from the Protein Data Bank.
BMC Evol Biol. 2006 May 31;6:43. doi: 10.1186/1471-2148-6-43.
6
Improving phylogenetic inference with a semiempirical amino acid substitution model.
Mol Biol Evol. 2013 Feb;30(2):469-79. doi: 10.1093/molbev/mss229. Epub 2012 Sep 21.
7
A new formulation of protein evolutionary models that account for structural constraints.
Mol Biol Evol. 2014 Mar;31(3):736-49. doi: 10.1093/molbev/mst240. Epub 2013 Dec 3.
8
An amino acid substitution-selection model adjusts residue fitness to improve phylogenetic estimation.
Mol Biol Evol. 2014 Apr;31(4):779-92. doi: 10.1093/molbev/msu044. Epub 2014 Jan 16.
9
Accounting for solvent accessibility and secondary structure in protein phylogenetics is clearly beneficial.
Syst Biol. 2010 May;59(3):277-87. doi: 10.1093/sysbio/syq002. Epub 2010 Mar 10.
10
A new criterion and method for amino acid classification.
J Theor Biol. 2004 May 7;228(1):97-106. doi: 10.1016/j.jtbi.2003.12.010.

Cited by

1
Ultrafast classical phylogenetic method beats large protein language models on variant effect prediction.
Adv Neural Inf Process Syst. 2024;37:130265-130290.
2
nT4X and nT4M: Novel Time Non-reversible Mixture Amino Acid Substitution Models.
J Mol Evol. 2025 Feb;93(1):136-148. doi: 10.1007/s00239-024-10230-8. Epub 2025 Jan 20.
3
MixtureFinder: Estimating DNA Mixture Models for Phylogenetic Analyses.
Mol Biol Evol. 2025 Jan 6;42(1). doi: 10.1093/molbev/msae264.
4
A unique symbiosome in an anaerobic single-celled eukaryote.
Nat Commun. 2024 Nov 9;15(1):9726. doi: 10.1038/s41467-024-54102-7.
5
The effectiveness of selection in a species affects the direction of amino acid frequency evolution.
bioRxiv. 2024 Jun 22:2023.02.01.526552. doi: 10.1101/2023.02.01.526552.
6
Phylogenomics resolves the higher-level phylogeny of herbivorous eriophyoid mites (Acariformes: Eriophyoidea).
BMC Biol. 2024 Mar 22;22(1):70. doi: 10.1186/s12915-024-01870-9.
7
Accurate Detection of Convergent Mutations in Large Protein Alignments With ConDor.
Genome Biol Evol. 2024 Apr 2;16(4). doi: 10.1093/gbe/evae040.
8
MAST: Phylogenetic Inference with Mixtures Across Sites and Trees.
Syst Biol. 2024 Jul 27;73(2):375-391. doi: 10.1093/sysbio/syae008.
9
Ant backbone phylogeny resolved by modelling compositional heterogeneity among sites in genomic data.
Commun Biol. 2024 Jan 17;7(1):106. doi: 10.1038/s42003-024-05793-7.
10
Phylogenomics reveals an almost perfect polytomy among the almost ungulates ().
bioRxiv. 2023 Dec 8:2023.12.07.570590. doi: 10.1101/2023.12.07.570590.