Forni Diego, Pozzoli Uberto, Mozzi Alessandra, Cagliani Rachele, Sironi Manuela
Scientific Institute IRCCS E. MEDEA, Bioinformatics, 23842 Bosisio Parini, Italy.
NAR Genom Bioinform. 2024 Jul 27;6(3):lqae088. doi: 10.1093/nargab/lqae088. eCollection 2024 Sep.
Dinucleotide biases have been widely investigated in the genomes of eukaryotes and viruses, but not in bacteria. We assembled a dataset of bacterial genomes (>15 000), which are representative of the genetic diversity in the kingdom Eubacteria, and we analyzed dinucleotide biases in relation to different traits. We found that TpA dinucleotides are the most depleted and that CpG dinucleotides show the widest dispersion. The abundances of both dinucleotides vary with genomic G + C content and show a very strong phylogenetic signal. After accounting for G + C content and phylogenetic inertia, we analyzed different bacterial lifestyle traits. We found that temperature preferences associate with the abundance of CpG dinucleotides, with thermophiles/hyperthermophiles being particularly depleted. Conversely, the TpA dinucleotide displays a bias that depends only on genomic G + C composition. Using predictions of intrinsic cyclizability, we also show that CpG depletion may associate with higher DNA bendability in both thermophiles/hyperthermophiles and mesophiles, and that the former are predicted to have significantly more flexible genomes than the latter. We suggest that higher bendability is advantageous at high temperatures because it facilitates DNA positive supercoiling and that, through modulation of DNA mechanical properties, local or global CpG depletion controls genome organization, most likely not only in bacteria.
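The abstract does not state how dinucleotide bias was quantified. As an illustration only, a common measure is the observed/expected odds ratio, rho(XY) = f(XY) / (f(X) f(Y)), where values below 1 indicate depletion. The sketch below assumes that single-strand definition; the function names `dinucleotide_odds_ratios` and `gc_content` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

BASES = "ACGT"

def gc_content(seq: str) -> float:
    """Fraction of G + C among unambiguous bases (assumed definition)."""
    seq = seq.upper()
    counts = Counter(b for b in seq if b in BASES)
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else math.nan

def dinucleotide_odds_ratios(seq: str) -> dict[str, float]:
    """Observed/expected ratio for each of the 16 dinucleotides.

    Assumption: single-strand odds ratio f(XY) / (f(X) * f(Y)).
    Dinucleotides containing ambiguous bases (e.g. N) are skipped.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    n_mono = sum(mono.values())
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_di = sum(di.values())

    ratios = {}
    for x in BASES:
        for y in BASES:
            if n_mono == 0 or n_di == 0:
                ratios[x + y] = math.nan
                continue
            expected = (mono[x] / n_mono) * (mono[y] / n_mono)
            observed = di[x + y] / n_di
            # Undefined when one of the two bases is absent from the sequence.
            ratios[x + y] = observed / expected if expected > 0 else math.nan
    return ratios
```

Under these assumptions, `dinucleotide_odds_ratios(genome)["CG"]` and `["TA"]` give the CpG and TpA ratios for one sequence. Comparing them across genomes would still require the corrections for G + C content and phylogeny that the abstract mentions, which this sketch does not attempt.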