Department of Genetics, Evolution and Environment, University College London, London, United Kingdom.
Mol Biol Evol. 2023 Apr 4;40(4). doi: 10.1093/molbev/msad041.
The CODEML program in the PAML package has been widely used to analyze protein-coding gene sequences to estimate the synonymous and nonsynonymous rates (dS and dN) and to detect positive Darwinian selection driving protein evolution. For users not familiar with molecular evolutionary analysis, the program is known to have a steep learning curve. Here, we provide a step-by-step protocol to illustrate the commonly used tests available in the program, including the branch models, the site models, and the branch-site models, which can be used to detect positive selection driving adaptive protein evolution affecting particular lineages of the species phylogeny, affecting a subset of amino acid residues in the protein, and affecting a subset of sites along prespecified lineages, respectively. A data set of the myxovirus (Mx) genes from ten mammal and two bird species is used as an example. We discuss a new feature in CODEML that allows users to perform positive selection tests for multiple genes for the same set of taxa, as is common in modern genome-sequencing projects. The PAML package is distributed at https://github.com/abacus-gene/paml under the GNU license, with support provided at its discussion site (https://groups.google.com/g/pamlsoftware). Data files used in this protocol are available at https://github.com/abacus-gene/paml-tutorial.
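The abstract does not spell out how these tests are evaluated. CODEML-based tests of positive selection are conventionally likelihood ratio tests (LRTs) between a null model and a nested alternative that allows a class of sites with dN/dS > 1. The sketch below shows only that final comparison step, assuming the two log-likelihoods have already been read from CODEML output. The function names, the log-likelihood values, and the choice of degrees of freedom are illustrative assumptions, not values taken from the protocol or from the Mx data set.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail chi-square probability, using closed forms for df = 1 or 2."""
    if stat <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (2 * delta lnL, p-value) for a nested model comparison."""
    # Small negative differences can arise from incomplete optimization;
    # they are clamped to zero here rather than reported as negative.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods for a null and an alternative site model.
lnl_null = -1000.0
lnl_alt = -995.0
stat, p_value = likelihood_ratio_test(lnl_null, lnl_alt, df=2)
print(f"2*delta(lnL) = {stat:.3f}, p = {p_value:.6f}")
```

A chi-square reference distribution with 2 degrees of freedom is often used for site-model comparisons and 1 degree of freedom for the branch-site test; the appropriate value for a given model pair should be checked against the PAML documentation.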