Horton James S, Cherry Joshua L, Waugh Gretel, Taylor Tiffany B
Institut Cochin, Université Paris Cité, INSERM U1016, CNRS UMR 8104, Paris 75014, France.
Department of Life Sciences and Milner Centre for Evolution, University of Bath, Claverton Down, Bath, UK.
Mol Biol Evol. 2025 Jul 30;42(8). doi: 10.1093/molbev/msaf183.
Nucleotides across a genome do not mutate at equal frequencies. Instead, specific nucleotide positions can exhibit much higher mutation rates than the genomic average due to their immediate nucleotide neighbors. These "mutational hotspots" can play a prominent role in adaptive evolution, yet we lack knowledge of which short nucleotide sequences drive hotspots. In this work, we combine experimental evolution with Pseudomonas fluorescens and bioinformatic analysis of various Salmonella species to characterize a short nucleotide motif (≥8 bp) that can drive T:A→G:C mutation rates >1000-fold higher than the baseline T→G rate in bacteria. First, we experimentally confirm previous analysis showing that homopolymeric tracts of G (≥3) with a 3' T frequently mutate so that the T is replaced with a G, resulting in an extension of the guanine tract, i.e. GGGT → GGGG. We then demonstrate that the potency of this T:A→G:C hotspot depends on the nucleotides immediately flanking the GnT sequence. We find that the dinucleotide immediately 5' to a G4 tract and the dinucleotide immediately 3' to the T strongly affect the T:A→G:C mutation rate, which ranges from ∼5-fold higher than the typical rate to over 1000-fold higher depending on the flanking elements. GnT motifs are therefore composed of several modular nucleotide components, each of which exerts a significant, quantifiable effect on the mutation rate. This work advances our ability to accurately identify the position and quantify the mutagenicity of hotspot motifs predicated on short nucleotide sequences.
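The motif structure described in the abstract (a 5' dinucleotide, a tract of at least three G, the focal T, and a 3' dinucleotide) can be illustrated with a minimal Python sketch that locates candidate GnT sites in a sequence. This is not the authors' pipeline: the function name `find_gnt_sites`, the returned field names, and the choice to scan only the given strand are assumptions made for illustration, and the sketch assigns no mutation rates because the abstract does not report per-flank values.

```python
import re

# Hypothetical pattern: a maximal run of >=3 G (not preceded by another G)
# followed immediately by the focal T. Only the supplied strand is scanned.
GNT_SITE = re.compile(r"(?<!G)(G{3,})T")


def find_gnt_sites(seq, flank=2):
    """Return candidate GnT sites with their flanking dinucleotides.

    Each site records the 0-based index of the focal T, the length of the
    G tract, and up to `flank` bases on the 5' side of the tract and on
    the 3' side of the T (shorter if the site is near a sequence end).
    """
    seq = seq.upper()
    sites = []
    for match in GNT_SITE.finditer(seq):
        run_start = match.start()
        t_index = match.end() - 1
        sites.append({
            "t_index": t_index,
            "g_run": len(match.group(1)),
            "five_prime": seq[max(0, run_start - flank):run_start],
            "three_prime": seq[t_index + 1:t_index + 1 + flank],
        })
    return sites


if __name__ == "__main__":
    for site in find_gnt_sites("ACGGGGTCAATGGGA"):
        print(site)
```

With the default two-base flanks and the minimum tract of three G, the smallest full site spans 2 + 3 + 1 + 2 = 8 bp, which matches the ≥8 bp motif length stated above. A fuller analysis would also scan the reverse complement, since the T:A pair can sit on either strand.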