Laboratoire de Physique Théorique de la Matière Condensée, CNRS, Sorbonne Université, Paris, France.
Structure et Instabilité des Génomes, Museum National d'Histoire Naturelle, CNRS, INSERM, Paris, France.
Nucleic Acids Res. 2024 Jul 8;52(12):6802-6810. doi: 10.1093/nar/gkae468.
The computational design of synthetic DNA sequences with designer in vivo properties is gaining traction in the field of synthetic genomics. Here we propose a computational method that combines a kinetic Monte Carlo framework with a deep mutational screening based on deep learning predictions. We apply our method to build regular nucleosome arrays with tailored nucleosomal repeat lengths (NRL) in yeast. Our design was validated in vivo by successfully engineering and integrating thousands of kilobases long tandem arrays of computationally optimized sequences, which could accommodate NRLs much larger than the natural yeast NRL (namely 197 and 237 bp, compared to the natural NRL of ∼165 bp). RNA-seq results show that transcription of the arrays can occur but is not driven by the NRL. The computational method proposed here delineates the key sequence rules for nucleosome positioning in yeast and should be easily applicable to other sequence properties and other genomes.
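The pairing of a mutational screen with a Monte Carlo move, as described in the abstract, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract specifies neither the network, the energy function nor the move rule, so everything named here is an assumption made for illustration. `toy_predictor` is a hypothetical stand-in for a trained deep-learning model (it simply returns windowed GC content), the squared-error `energy` and the box-shaped `target_profile` are invented, and the move rule is a generic rejection-free choice among all single-base mutants with Boltzmann weights.

```python
import math
import random

BASES = "ACGT"


def toy_predictor(seq):
    """Hypothetical stand-in for a trained deep-learning predictor.

    Returns one pseudo-occupancy value per position: the GC fraction in an
    11 bp window centred on that position (circular, to mimic a tandem repeat).
    """
    n, half = len(seq), 5
    gc = [1.0 if b in "GC" else 0.0 for b in seq]
    return [
        sum(gc[(i + k) % n] for k in range(-half, half + 1)) / (2 * half + 1)
        for i in range(n)
    ]


def target_profile(n, linker):
    """Assumed target: occupancy 1 everywhere except the last `linker` bp."""
    return [1.0 if i < n - linker else 0.0 for i in range(n)]


def energy(seq, target):
    """Assumed energy: squared distance between prediction and target."""
    return sum((p - t) ** 2 for p, t in zip(toy_predictor(seq), target))


def kmc_step(seq, target, temperature, rng):
    """Screen every single-base mutant, then draw one with Boltzmann weight."""
    e0 = energy(seq, target)
    mutants, deltas = [], []
    for i, old in enumerate(seq):
        for new in BASES:
            if new != old:
                mutant = seq[:i] + new + seq[i + 1:]
                mutants.append(mutant)
                deltas.append(energy(mutant, target) - e0)
    # Shift by the smallest delta so the largest weight is exactly 1.
    d_min = min(deltas)
    weights = [math.exp(-(d - d_min) / temperature) for d in deltas]
    return rng.choices(mutants, weights=weights, k=1)[0]


def design(length=40, linker=10, steps=30, temperature=0.05, seed=0):
    """Run the toy optimisation; return (best sequence, start energy, best energy)."""
    rng = random.Random(seed)
    seq = "".join(rng.choice(BASES) for _ in range(length))
    target = target_profile(length, linker)
    start = energy(seq, target)
    best_seq, best_e = seq, start
    for _ in range(steps):
        seq = kmc_step(seq, target, temperature, rng)
        e = energy(seq, target)
        if e < best_e:
            best_seq, best_e = seq, e
    return best_seq, start, best_e


if __name__ == "__main__":
    best, e_start, e_best = design()
    print(best, round(e_start, 3), round(e_best, 3))
```

Every step evaluates all 3 × length single-base mutants, which is the exhaustive screen; the Boltzmann-weighted draw then always accepts one of them, so no proposal is rejected. With a real predictor the screen would typically be scored as one batched model call rather than in a Python loop.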