Department of Chemical Engineering, Pennsylvania State University, University Park, PA, 16801, USA.
Bioinformatics and Genomics, Pennsylvania State University, University Park, PA, 16801, USA.
Nat Commun. 2022 Sep 2;13(1):5159. doi: 10.1038/s41467-022-32829-5.
Transcription rates are regulated by the interactions between RNA polymerase, sigma factor, and promoter DNA sequences in bacteria. However, it remains unclear how non-canonical sequence motifs collectively control transcription rates. Here, we combine massively parallel assays, biophysics, and machine learning to develop a 346-parameter model that predicts site-specific transcription initiation rates for any σ⁷⁰ promoter sequence, validated across 22,132 bacterial promoters with diverse sequences. We apply the model to predict genetic context effects, design σ⁷⁰ promoters with desired transcription rates, and identify undesired promoters inside engineered genetic systems. The model provides a biophysical basis for understanding gene regulation in natural genetic systems and precise transcriptional control for engineering synthetic genetic systems.