Koers Cooper, Bierman Rob, Xu Huixin, Akey Joshua M
Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA.
bioRxiv. 2025 Aug 28:2025.08.28.667654. doi: 10.1101/2025.08.28.667654.
The ratio of nonsynonymous (dN) to synonymous (dS) substitutions in protein-coding genes is a fundamental metric in molecular evolution, used to test hypotheses about the relative contributions of genetic drift and natural selection in shaping patterns of protein divergence (Williams et al., 2020). However, interpretation of dN/dS ratios may be confounded by sequence context and by the choice of substitution model (Hughes, 2007; Kryazhimskiy & Plotkin, 2008). We present MutagenesisForge, a modular command-line tool and Python package for simulating codon-level mutagenesis and calculating dN/dS under user-specified conditions. At its core is the MutationModel interface, which supports specific substitution matrices and ensures consistency across both the Exhaustive and Contextual modes of simulation. These modes allow users to test evolutionary hypotheses or to generate null distributions of dN/dS across a range of biologically relevant models. As large-scale DNA sequencing data sets continue to be generated both within and between species, MutagenesisForge offers a flexible platform for evolutionary analysis and hypothesis testing of mutational processes in protein-coding genes.
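To make the idea of exhaustive codon-level mutagenesis concrete, the following is a minimal Python sketch, not MutagenesisForge's actual API. It enumerates every single-nucleotide substitution in a coding sequence, classifies each as synonymous or nonsynonymous under the standard genetic code, and weights transitions by a factor `kappa` as a stand-in for a simple substitution model. The function name `count_exhaustive`, the `kappa` parameter, and the choice to count stop-codon changes as nonsynonymous are all assumptions made for illustration. The returned totals are weighted mutational opportunities, not a full dN/dS estimate, which would additionally require observed substitutions normalized per site.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Purine<->purine and pyrimidine<->pyrimidine changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def count_exhaustive(seq, kappa=1.0):
    """Return (nonsynonymous, synonymous) weighted counts over all
    single-nucleotide substitutions in an in-frame coding sequence.

    Hypothetical helper: transitions get weight `kappa`, transversions
    weight 1.0. Changes to or from a stop codon count as nonsynonymous.
    Trailing bases that do not fill a codon are ignored.
    """
    seq = seq.upper()
    nonsyn = syn = 0.0
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE[codon]
        for pos in range(3):
            for alt in BASES:
                if alt == codon[pos]:
                    continue
                weight = kappa if (codon[pos], alt) in TRANSITIONS else 1.0
                mutant = codon[:pos] + alt + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa:
                    syn += weight
                else:
                    nonsyn += weight
    return nonsyn, syn


if __name__ == "__main__":
    n, s = count_exhaustive("ATGGGG", kappa=2.0)
    print(f"nonsynonymous={n}, synonymous={s}")
```

A null expectation built this way depends on both the codon composition of the sequence and the weights given to each substitution type, which is the kind of confounding by sequence context and substitution model that the abstract describes.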