
A new semiempirical codon substitution model based on principal component analysis of mammalian sequences.

Affiliation

Department of Computer Science, Eidgenössische Technische Hochschule Zurich, Zürich, Switzerland.

Publication information

Mol Biol Evol. 2012 Jan;29(1):271-7. doi: 10.1093/molbev/msr198. Epub 2011 Aug 11.

DOI: 10.1093/molbev/msr198
PMID: 21836183
Abstract

Codon substitution models have traditionally been parametric Markov models, but recently, empirical and semiempirical models have also been proposed. Parametric codon models are typically based on 61×61 rate matrices that are derived from a small number of parameters. These parameters are rooted in experience and theoretical considerations and generally show good performance but are still relatively arbitrary. We have previously used principal component analysis (PCA) on data obtained from mammalian sequence alignments to empirically identify the most relevant parameters for codon substitution models, thereby confirming some commonly used parameters but also suggesting new ones. Here, we present a new semiempirical codon substitution model that is directly based on those PCA results. The substitution rate matrix is constructed from linear combinations of the first few (the most important) principal components, with the coefficients being free model parameters. Thus, the model is not only based on empirical rates but also uses the empirically determined most relevant parameters for a codon model to adjust to the particularities of individual data sets. In comparisons against established parametric and semiempirical models, the new model consistently achieves the highest likelihood values when applied to sequences of vertebrates, which include the taxonomic class on which the model was trained.
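The construction described in the abstract (a rate matrix built from a linear combination of leading principal components, with the coefficients as free parameters) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say which quantity the PCA was run on, so the code assumes the components live in the space of log-exchangeabilities for the 1830 unordered codon pairs, and it uses random placeholder data in place of the real mammalian components. The names `build_rate_matrix`, `mean_log_rates`, `components`, and `coeffs` are hypothetical.

```python
import numpy as np

N_CODONS = 61  # sense codons in the standard genetic code


def build_rate_matrix(mean_log_rates, components, coeffs, freqs):
    """Assemble a reversible 61x61 rate matrix from principal components.

    mean_log_rates : (n_pairs,) mean log-exchangeabilities (assumed PCA centre)
    components     : (k, n_pairs) leading principal components (assumed)
    coeffs         : (k,) free model parameters, one per retained component
    freqs          : (61,) equilibrium codon frequencies
    """
    iu = np.triu_indices(N_CODONS, k=1)
    # Linear combination of components; working in log space (an assumption)
    # keeps every exchangeability positive after exponentiation.
    log_s = mean_log_rates + coeffs @ components
    S = np.zeros((N_CODONS, N_CODONS))
    S[iu] = np.exp(log_s)
    S = S + S.T                      # symmetric exchangeabilities
    Q = S * freqs[None, :]           # Q_ij = S_ij * pi_j for i != j
    np.fill_diagonal(Q, -Q.sum(axis=1))  # rows sum to zero
    # Normalise to one expected substitution per unit time.
    scale = -np.sum(freqs * np.diag(Q))
    return Q / scale


# Placeholder inputs: random numbers stand in for the empirical PCA results.
rng = np.random.default_rng(0)
n_pairs = N_CODONS * (N_CODONS - 1) // 2
mean_log_rates = rng.normal(size=n_pairs)
components = rng.normal(size=(3, n_pairs))
freqs = rng.dirichlet(np.ones(N_CODONS))
coeffs = np.array([0.5, -0.2, 0.1])

Q = build_rate_matrix(mean_log_rates, components, coeffs, freqs)
```

In a likelihood analysis, `coeffs` would be optimised per data set alongside branch lengths, which is how a model of this kind could adapt empirical rates to individual alignments.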


Similar articles

1
A new semiempirical codon substitution model based on principal component analysis of mammalian sequences.
Mol Biol Evol. 2012 Jan;29(1):271-7. doi: 10.1093/molbev/msr198. Epub 2011 Aug 11.
2
Improving phylogenetic inference with a semiempirical amino acid substitution model.
Mol Biol Evol. 2013 Feb;30(2):469-79. doi: 10.1093/molbev/mss229. Epub 2012 Sep 21.
3
Empirical analysis of the most relevant parameters of codon substitution models.
J Mol Evol. 2010 Jun;70(6):605-12. doi: 10.1007/s00239-010-9356-9. Epub 2010 Jun 5.
4
Empirical codon substitution matrix.
BMC Bioinformatics. 2005 Jun 1;6:134. doi: 10.1186/1471-2105-6-134.
5
Synonymous substitutions substantially improve evolutionary inference from highly diverged proteins.
Syst Biol. 2008 Jun;57(3):367-77. doi: 10.1080/10635150802158670.
6
Empirical models for substitution in ribosomal RNA.
Mol Biol Evol. 2004 Mar;21(3):419-27. doi: 10.1093/molbev/msh029. Epub 2003 Dec 5.
7
A combined empirical and mechanistic codon model.
Mol Biol Evol. 2007 Feb;24(2):388-97. doi: 10.1093/molbev/msl175. Epub 2006 Nov 16.
8
Inferring complex DNA substitution processes on phylogenies using uniformization and data augmentation.
Syst Biol. 2006 Apr;55(2):259-69. doi: 10.1080/10635150500541599.
9
Investigating protein-coding sequence evolution with probabilistic codon substitution models.
Mol Biol Evol. 2009 Feb;26(2):255-71. doi: 10.1093/molbev/msn232. Epub 2008 Oct 14.
10
Evolutionary model selection with a genetic algorithm: a case study using stem RNA.
Mol Biol Evol. 2007 Jan;24(1):159-70. doi: 10.1093/molbev/msl144. Epub 2006 Oct 12.

Cited by

1
Next-generation development and application of codon model in evolution.
Front Genet. 2023 Jan 27;14:1091575. doi: 10.3389/fgene.2023.1091575. eCollection 2023.
2
Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes.
PLoS One. 2021 Mar 12;16(3):e0248337. doi: 10.1371/journal.pone.0248337. eCollection 2021.
3
Improved inference of site-specific positive selection under a generalized parametric codon model when there are multinucleotide mutations and multiple nonsynonymous rates.
BMC Evol Biol. 2019 Jan 14;19(1):22. doi: 10.1186/s12862-018-1326-7.
4
A generalized mechanistic codon model.
Mol Biol Evol. 2014 Sep;31(9):2528-41. doi: 10.1093/molbev/msu196. Epub 2014 Jun 23.
5
Non-negative matrix factorization for learning alignment-specific models of protein evolution.
PLoS One. 2011;6(12):e28898. doi: 10.1371/journal.pone.0028898. Epub 2011 Dec 22.