Design of synthetic promoters for cyanobacteria with generative deep-learning model.

Affiliations

Department of Chemical Engineering, Pohang University of Science and Technology (POSTECH), 77 Cheongam-Ro, Nam-Gu, Pohang, Gyeongbuk 37673, Korea.

School of Energy and Chemical Engineering, Ulsan National Institute of Science and Technology (UNIST), 50 UNIST-Gil, Eonyang-Eup, Ulsan 44919, Korea.

Publication information

Nucleic Acids Res. 2023 Jul 21;51(13):7071-7082. doi: 10.1093/nar/gkad451.

DOI: 10.1093/nar/gkad451
PMID: 37246641
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10359606/

Abstract

Deep generative models, which can approximate complex data distribution from large datasets, are widely used in biological dataset analysis. In particular, they can identify and unravel hidden traits encoded within a complicated nucleotide sequence, allowing us to design genetic parts with accuracy. Here, we provide a deep-learning based generic framework to design and evaluate synthetic promoters for cyanobacteria using generative models, which was in turn validated with cell-free transcription assay. We developed a deep generative model and a predictive model using a variational autoencoder and convolutional neural network, respectively. Using native promoter sequences of the model unicellular cyanobacterium Synechocystis sp. PCC 6803 as a training dataset, we generated 10 000 synthetic promoter sequences and predicted their strengths. By position weight matrix and k-mer analyses, we confirmed that our model captured a valid feature of cyanobacteria promoters from the dataset. Furthermore, critical subregion identification analysis consistently revealed the importance of the -10 box sequence motif in cyanobacteria promoters. Moreover, we validated that the generated promoter sequence can efficiently drive transcription via cell-free transcription assay. This approach, combining in silico and in vitro studies, will provide a foundation for the rapid design and validation of synthetic promoters, especially for non-model organisms.
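The abstract names position weight matrix (PWM) and k-mer analyses as the checks that the generated sequences resemble native promoters. The following is a minimal sketch of what those two statistics compute, not the authors' pipeline. The function names, the pseudocount, the uniform background frequency, and the toy six-base sequences are all assumptions introduced here for illustration.

```python
from collections import Counter
import math


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def position_weight_matrix(seqs, pseudocount=1.0, background=0.25):
    """Return per-position log2 odds scores for equal-length sequences.

    Each position is a dict mapping base -> log2(frequency / background).
    The pseudocount and uniform background are illustrative assumptions.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    total = len(seqs) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = Counter(s[pos] for s in seqs)
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


# Hypothetical toy sequences resembling a -10 box; not data from the paper.
toy_sequences = ["TATAAT", "TATAAT", "TACAAT", "TATACT"]
toy_kmers = kmer_counts(toy_sequences, 2)
toy_pwm = position_weight_matrix(toy_sequences)
```

Under these assumptions, comparing the PWM and k-mer profile of a generated set against those of the native training set gives a rough view of whether motifs such as the -10 box are preserved.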

Figures

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c94/10359606/d8ca2e7bde93/gkad451figgra1.jpg
Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c94/10359606/a86c0f2f78c6/gkad451fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c94/10359606/1abfec5b4bb1/gkad451fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c94/10359606/fe7ce75f258f/gkad451fig3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c94/10359606/f99cd7b7ea04/gkad451fig4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2c94/10359606/365860d4ac6b/gkad451fig5.jpg

