Yin Christopher, Castillo-Hair Sebastian, Byeon Gun Woo, Bromley Peter, Meuleman Wouter, Seelig Georg
Department of Electrical & Computer Engineering, University of Washington, Seattle, WA, USA.
Altius Institute for Biomedical Sciences, Seattle, WA, USA.
Cell Syst. 2025 Jun 4:101302. doi: 10.1016/j.cels.2025.101302.
An important and largely unsolved problem in synthetic biology is how to target gene expression to specific cell types. Here, we apply iterative deep learning to design synthetic enhancers with strong differential activity between two human cell lines. We initially train models on published datasets of enhancer activity and chromatin accessibility and use them to guide the design of synthetic enhancers that maximize predicted specificity. We experimentally validate these sequences, use the measurements to re-optimize the model, and design a second generation of enhancers with improved specificity. Our design methods embed relevant transcription factor binding site (TFBS) motifs with higher frequency than comparable endogenous enhancers while using a more selective motif vocabulary, and we show that enhancer activity is correlated with transcription factor expression at the single-cell level. Finally, we characterize causal features of top enhancers via perturbation experiments and show that enhancers as short as 50 bp can maintain specificity. A record of this paper's transparent peer review process is included in the supplemental information.
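The design step summarized above, choosing sequences that maximize a model's predicted specificity between two cell lines, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the scoring function is a toy motif counter standing in for a trained deep learning model, the two motif strings are arbitrary placeholders, and greedy hill climbing is swapped in as a simple optimizer rather than the authors' design method.

```python
import random

BASES = "ACGT"

# Hypothetical stand-in for a trained model. The motifs below are arbitrary
# placeholder strings chosen for illustration, not taken from the paper.
MOTIF_A = "GATAA"  # pretend this motif drives activity in cell line A
MOTIF_B = "TGACT"  # pretend this motif drives activity in cell line B


def count_motif(seq, motif):
    """Count (possibly overlapping) occurrences of motif in seq."""
    return sum(
        1
        for i in range(len(seq) - len(motif) + 1)
        if seq[i:i + len(motif)] == motif
    )


def predicted_specificity(seq):
    """Toy score: predicted activity in line A minus predicted activity in line B."""
    return count_motif(seq, MOTIF_A) - count_motif(seq, MOTIF_B)


def design(length=50, steps=2000, seed=0):
    """Greedy hill climbing over single-base mutations.

    A mutation is kept if it does not lower the predicted specificity,
    and reverted otherwise.
    """
    rng = random.Random(seed)
    seq = [rng.choice(BASES) for _ in range(length)]
    best = predicted_specificity("".join(seq))
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(BASES)
        score = predicted_specificity("".join(seq))
        if score >= best:
            best = score
        else:
            seq[pos] = old  # revert mutations that lower the score
    return "".join(seq), best


if __name__ == "__main__":
    enhancer, score = design()
    print(enhancer, score)
```

In the iterative setting the abstract describes, the measured activities of designed sequences would be used to re-optimize the model before a second design round; in this sketch that would correspond to replacing `predicted_specificity` with an updated predictor and calling `design` again.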