Hannah Verdonk, Alyssa Pivirotto, Vitor Pavinato, Jody Hey, Sergei L. K. Pond
Institute for Genomics and Evolutionary Medicine, Department of Biology, Temple University, Philadelphia, Pennsylvania, USA.
Center for Computational Genetics and Genomics, Department of Biology, Temple University, Philadelphia, Pennsylvania, USA.
bioRxiv. 2025 Feb 6:2024.09.17.613331. doi: 10.1101/2024.09.17.613331.
Selection on synonymous codon usage is a well-known and widespread phenomenon, yet existing models often do not account for it or for its effect on synonymous substitution rates. In this article, we develop and extend Multiclass Synonymous Substitution (MSS) models, which account for such selection by partitioning synonymous substitutions into two or more classes and estimating a relative substitution rate for each class, while accounting for important confounders such as mutation bias. We identify extensive heterogeneity among relative synonymous substitution rates in an empirical dataset of ~12,000 gene alignments from twelve species. We validate model performance using data generated by forward population genetic simulation, demonstrating that MSS models are robust to model misspecification. MSS rates are significantly correlated with other covariates of selection on codon usage (population-level polymorphism data and tRNA abundance data), suggesting that MSS models can detect weak signatures of selection on codon usage. With the MSS model, we can now study selection on synonymous substitutions in diverse taxa, independent of any assumptions about the forces driving that selection.
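The partitioning idea described in the abstract can be illustrated with a minimal Python sketch. It enumerates the single-nucleotide synonymous codon pairs of the standard genetic code and groups them into classes, each of which would receive its own relative rate. The grouping used here (one class per amino acid) is only an assumption chosen for illustration; the abstract states only that there are "two or more classes". The names `synonymous_pairs`, `partition_by_amino_acid`, and `relative_rate` are hypothetical, and the rate values are placeholders, not estimates from the paper.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_pairs():
    """Return unordered sense-codon pairs that differ at exactly one
    nucleotide and encode the same amino acid."""
    codons = sorted(c for c, aa in CODON_TABLE.items() if aa != "*")
    pairs = []
    for i, a in enumerate(codons):
        for b in codons[i + 1:]:
            n_diffs = sum(x != y for x, y in zip(a, b))
            if n_diffs == 1 and CODON_TABLE[a] == CODON_TABLE[b]:
                pairs.append((a, b))
    return pairs


def partition_by_amino_acid(pairs):
    """Hypothetical partition: one synonymous class per amino acid."""
    classes = defaultdict(list)
    for a, b in pairs:
        classes[CODON_TABLE[a]].append((a, b))
    return dict(classes)


def relative_rate(a, b, class_rates, default=1.0):
    """Look up the relative synonymous rate for the pair (a, b).
    Classes without an entry fall back to the reference rate (assumed 1.0)."""
    return class_rates.get(CODON_TABLE[a], default)


pairs = synonymous_pairs()
classes = partition_by_amino_acid(pairs)

# Placeholder rates for illustration only; in a real analysis these would be
# estimated from an alignment by maximum likelihood.
example_rates = {"L": 0.8, "S": 1.3}
rate_leu = relative_rate("CTA", "CTG", example_rates)
rate_gly = relative_rate("GGA", "GGG", example_rates)
```

In this sketch each class rate acts as a multiplier on the synonymous substitutions it covers, relative to a reference class fixed at 1.0; a coarser partition (for example, two classes) would simply map several amino acids to the same key.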