Department of Mathematics and Statistics, University of Calgary, 2500 University Drive NW, Calgary AB T2N 1N4, Canada and UMR CNRS 5558 - LBBE, Université Lyon 1, Villeurbanne Cedex, France.
Bioinformatics. 2014 Apr 1;30(7):1020-1. doi: 10.1093/bioinformatics/btt729. Epub 2013 Dec 18.
In recent years, there has been increasing interest in the potential of codon substitution models for a variety of applications. However, the computational demands of these models have sometimes led to the adoption of oversimplified assumptions, questionable statistical methods or a limited focus on small data sets.
Here, we offer a scalable, message-passing-interface-based Bayesian implementation of site-heterogeneous codon models in the mutation-selection framework. Our software jointly infers the global mutational parameters at the nucleotide level, the branch lengths of the tree and a Dirichlet process governing across-site variation at the amino acid level. As an example, we estimate the distribution of selection coefficients from an alignment of several hundred sequences of the influenza PB2 gene, and highlight the site-specific characterization enabled by such a modeling approach. Finally, we discuss potential future applications of the software for conducting evolutionary inferences.
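For readers unfamiliar with the mutation-selection framework, the following minimal Python sketch shows the standard form such models commonly take: the rate of a nonsynonymous codon substitution is the mutational rate multiplied by a relative fixation factor S / (1 - exp(-S)), where S is the scaled selection coefficient, taken here as the log ratio of the site-specific amino acid fitnesses. This is an illustration under that assumption, not code from PhyloBayes-MPI; the function names and the example fitness values are hypothetical.

```python
import math


def fixation_factor(s: float) -> float:
    """Relative fixation rate S / (1 - exp(-S)) for a scaled selection coefficient S.

    Equals 1 for a neutral change (S = 0), exceeds 1 for advantageous
    changes (S > 0) and falls below 1 for deleterious ones (S < 0).
    """
    if abs(s) < 1e-8:
        return 1.0  # neutral limit of S / (1 - exp(-S))
    if s > 0:
        return s / -math.expm1(-s)
    # Algebraically identical form that avoids overflow for very negative S.
    return s * math.exp(s) / math.expm1(s)


def scaled_selection_coefficient(fitness_from: float, fitness_to: float) -> float:
    """S = ln(fitness_to / fitness_from) for site-specific amino acid fitnesses."""
    return math.log(fitness_to / fitness_from)


def substitution_rate(mutation_rate: float, fitness_from: float, fitness_to: float) -> float:
    """Nonsynonymous substitution rate: mutation rate times fixation factor."""
    s = scaled_selection_coefficient(fitness_from, fitness_to)
    return mutation_rate * fixation_factor(s)


if __name__ == "__main__":
    # Hypothetical fitnesses of two amino acids at one site.
    low, high = 0.05, 0.40
    print("toward fitter amino acid:   ", substitution_rate(1.0, low, high))
    print("toward less fit amino acid: ", substitution_rate(1.0, high, low))
```

In this sketch, a synonymous change corresponds to equal fitnesses, so its rate reduces to the mutational rate alone. The distribution of selection coefficients mentioned above would, under the same assumption, be built from S values of this kind across sites and possible mutations.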
The models are implemented within the PhyloBayes-MPI package (available at phylobayes.org), with usage details in the accompanying manual.