Dynamic programming procedure for searching optimal models to estimate substitution rates based on the maximum-likelihood method.

Affiliation

National Key Laboratory of Crop Genetic Improvement and National Center of Plant Gene Research Wuhan, Huazhong Agricultural University, Wuhan 430070, China.

Publication information

Proc Natl Acad Sci U S A. 2011 May 10;108(19):7860-5. doi: 10.1073/pnas.1018621108. Epub 2011 Apr 26.

DOI:10.1073/pnas.1018621108

PMID:21521791

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3093512/

Abstract

The substitution rate in a gene can provide valuable information for understanding its functionality and evolution. A widely used method to estimate substitution rates is the maximum-likelihood method implemented in the CODEML program in the PAML package. Typically, a limited number of branch models, chosen based on a priori information or an interest in particular lineages, are tested, whereas a large number of potential models are neglected. A complementary approach is therefore needed that tests all or a large number of possible models to search for the globally optimal model(s) of maximum likelihood. However, the computational time for this search becomes impractically long even for a small number of sequences. Thus, it is desirable to explore only the most probable regions of the model space when searching for the optimal models. Using dynamic programming techniques, we developed a simple computational method for searching the most probable optimal branch-specific models in a practically feasible computational time. We propose three search methods to find the optimal models, which explore from O(n) (method 1) to O(n^2) (methods 2 and 3) models when the given phylogeny has n branches. In addition, we derived a formula to calculate the number of all possible models, revealing the complexity of finding the optimal branch-specific model. We show that in a reanalysis of over 50 previously published studies, the vast majority yielded better models with significantly higher likelihoods than the conventional hypothesis-driven models.
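To give a feel for why an exhaustive search is impractical, the following is a minimal sketch that counts candidate models under one stated assumption: a branch model is taken to be a partition of the n branches into classes that share a single substitution-rate ratio. Under that assumption the count is the Bell number B(n). This is an illustration only; the abstract does not state the formula the authors derived, and it may differ from this one. The function name `bell_numbers` is hypothetical.

```python
def bell_numbers(n_max):
    """Return [B(0), ..., B(n_max)], computed with the Bell triangle.

    Assumption (not taken from the abstract): a branch model is a
    partition of the branches into classes sharing one rate ratio,
    so the number of models on n branches is the Bell number B(n).
    """
    bells = [1]   # B(0) = 1
    row = [1]     # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        # Each further entry adds the entry to its left and the one above that.
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


if __name__ == "__main__":
    counts = bell_numbers(20)
    # Compare the exhaustive count with the O(n) and O(n^2) search sizes.
    for n in (5, 10, 15, 20):
        print(f"n={n:2d}  n^2={n * n:4d}  all partitions={counts[n]}")
```

Even for ten branches the partition count is already above one hundred thousand, whereas a search that examines on the order of n or n^2 models stays in the tens or hundreds, which is the gap the heuristic search methods are meant to close.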
