Evolutionary model selection with a genetic algorithm: a case study using stem RNA.

Authors

Sergei L. Kosakovsky Pond, Frank V. Mannino, Michael B. Gravenor, Spencer V. Muse, Simon D. W. Frost

Affiliation

Department of Pathology, University of California, San Diego, USA.

Publication information

Mol Biol Evol. 2007 Jan;24(1):159-70. doi: 10.1093/molbev/msl144. Epub 2006 Oct 12.

DOI: 10.1093/molbev/msl144
PMID: 17038448
Abstract

The choice of a probabilistic model to describe sequence evolution can and should be justified. Underfitting the data through the use of overly simplistic models may miss out on interesting phenomena and lead to incorrect inferences. Overfitting the data with models that are too complex may ascribe biological meaning to statistical artifacts and result in falsely significant findings. We describe a likelihood-based approach for evolutionary model selection. The procedure employs a genetic algorithm (GA) to quickly explore a combinatorially large set of all possible time-reversible Markov models with a fixed number of substitution rates. When applied to stem RNA data subject to well-understood evolutionary forces, the models found by the GA 1) capture the expected overall rate patterns a priori; 2) fit the data better than the best available models based on a priori assumptions, suggesting subtle substitution patterns not previously recognized; 3) cannot be rejected in favor of the general reversible model, implying that the evolution of stem RNA sequences can be explained well with only a few substitution rate parameters; and 4) perform well on simulated data, both in terms of goodness of fit and the ability to estimate evolutionary rates. We also investigate the utility of several distance measures for comparing and contrasting inferred evolutionary models. Using widely available small computer clusters, our approach allows, for the first time, to evaluate the performance of existing RNA evolutionary models by comparing them with a large pool of candidate models and to validate common modeling assumptions. In addition, the new method provides the foundation for rigorous selection and comparison of substitution models for other types of sequence data.
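
The search problem described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a candidate model is an assignment of n substitution-rate parameters to k shared rate classes, counts how many such partitions exist, and runs a toy genetic algorithm over them. The names `stirling2`, `score`, and `ga_search`, the example `rates` vector, and all parameter defaults are hypothetical. The toy `score` is a squared-error stand-in for the likelihood-based criterion the abstract refers to.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Ways to partition n labelled rates into exactly k non-empty classes."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def score(assign, rates, k):
    """Toy fitness (an assumption, not a likelihood): negative within-class
    squared error when rates in the same class are forced to share one value."""
    total = 0.0
    for c in range(k):
        members = [r for r, a in zip(rates, assign) if a == c]
        if members:
            mean = sum(members) / len(members)
            total += sum((r - mean) ** 2 for r in members)
    return -total


def ga_search(rates, k, pop_size=30, generations=60, mut_prob=0.1, seed=0):
    """Search class assignments with elitism, one-point crossover and mutation."""
    rng = random.Random(seed)
    n = len(rates)
    # Each individual maps rate index -> class label in range(k).
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=lambda a: score(a, rates, k), reverse=True)
        best_history.append(score(pop[0], rates, k))
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Point mutation: occasionally reassign a rate to a random class.
            child = [rng.randrange(k) if rng.random() < mut_prob else c
                     for c in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=lambda a: score(a, rates, k))
    return best, best_history


if __name__ == "__main__":
    toy_rates = [0.10, 0.12, 1.00, 1.10, 0.11, 0.95]  # made-up values
    best, history = ga_search(toy_rates, k=2)
    print(stirling2(len(toy_rates), 2), best, history[-1])
```

With six rate parameters, as in a nucleotide model, `stirling2(6, 2)` is 31 and the total over all k from 1 to 6 is 203, so exhaustive enumeration is still feasible. The count grows very quickly with n, which is the situation where a heuristic search of this kind becomes attractive. The toy search does not require every class to be used, so a stricter version would need to reject assignments with empty classes.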


Similar articles

1
Evolutionary model selection with a genetic algorithm: a case study using stem RNA.
Mol Biol Evol. 2007 Jan;24(1):159-70. doi: 10.1093/molbev/msl144. Epub 2006 Oct 12.
2
Empirical models for substitution in ribosomal RNA.
Mol Biol Evol. 2004 Mar;21(3):419-27. doi: 10.1093/molbev/msh029. Epub 2003 Dec 5.
3
Non-homogeneous models of sequence evolution in the Bio++ suite of libraries and programs.
BMC Evol Biol. 2008 Sep 22;8:255. doi: 10.1186/1471-2148-8-255.
4
A model-based approach to study nearest-neighbor influences reveals complex substitution patterns in non-coding sequences.
Syst Biol. 2008 Oct;57(5):675-92. doi: 10.1080/10635150802422324.
5
Evolutionary rate variation and RNA secondary structure prediction.
Comput Biol Chem. 2004 Jul;28(3):219-26. doi: 10.1016/j.compbiolchem.2004.04.001.
6
A "Long Indel" model for evolutionary sequence alignment.
Mol Biol Evol. 2004 Mar;21(3):529-40. doi: 10.1093/molbev/msh043. Epub 2003 Dec 23.
7
Likelihood-based clustering (LiBaC) for codon models, a method for grouping sites according to similarities in the underlying process of evolution.
Mol Biol Evol. 2008 Sep;25(9):1995-2007. doi: 10.1093/molbev/msn145. Epub 2008 Jun 26.
8
Pseudo-likelihood for non-reversible nucleotide substitution models with neighbour dependent rates.
Stat Appl Genet Mol Biol. 2006;5:Article18. doi: 10.2202/1544-6115.1217. Epub 2006 Jul 31.
9
Statistical alignment: computational properties, homology testing and goodness-of-fit.
J Mol Biol. 2000 Sep 8;302(1):265-79. doi: 10.1006/jmbi.2000.4061.
10
Spatial and temporal heterogeneity in nucleotide sequence evolution.
Mol Biol Evol. 2008 Aug;25(8):1683-94. doi: 10.1093/molbev/msn119. Epub 2008 May 22.

Cited by

1
Developing and Applying RNA Empirical Models With Secondary Structure Insights for Orthoptera Phylogenetics.
Ecol Evol. 2025 Aug 31;15(9):e72068. doi: 10.1002/ece3.72068. eCollection 2025 Sep.
2
A New Comparative Framework for Estimating Selection on Synonymous Substitutions.
Mol Biol Evol. 2025 Apr 1;42(4). doi: 10.1093/molbev/msaf068.
3
A new comparative framework for estimating selection on synonymous substitutions.
bioRxiv. 2025 Feb 6:2024.09.17.613331. doi: 10.1101/2024.09.17.613331.
4
Role of the Connexin C-terminus in skin pattern formation of Zebrafish.
BBA Adv. 2021 Mar 17;1:100006. doi: 10.1016/j.bbadva.2021.100006. eCollection 2021.
5
Evaluation of global HIV/SIV envelope gp120 RNA structure and evolution within and among infected hosts.
Virus Evol. 2018 Jun 21;4(1):vey018. doi: 10.1093/ve/vey018. eCollection 2018 Jan.
6
Ecological niche modeling re-examined: A case study with the Darwin's fox.
Ecol Evol. 2018 Apr 16;8(10):4757-4770. doi: 10.1002/ece3.4014. eCollection 2018 May.
7
The Effect of RNA Substitution Models on Viroid and RNA Virus Phylogenies.
Genome Biol Evol. 2018 Feb 1;10(2):657-666. doi: 10.1093/gbe/evx273.
8
Adaptive molecular evolution of gene reveals the evidence for positive diversifying selection in indigenous goat populations.
Ecol Evol. 2017 Jun 7;7(14):5170-5180. doi: 10.1002/ece3.2919. eCollection 2017 Jul.
9
Modeling invasive species spread in Lake Champlain via evolutionary computations.
Theory Biosci. 2011 Jun;130(2):145-52. doi: 10.1007/s12064-011-0122-3. Epub 2011 Feb 4.
10
CodonTest: modeling amino acid substitution preferences in coding sequences.
PLoS Comput Biol. 2010 Aug 19;6(8):e1000885. doi: 10.1371/journal.pcbi.1000885.