A Computational Framework for Identifying Promoter Sequences in Nonmodel Organisms Using RNA-seq Data Sets.

Author information

Wilson Erin H, Groom Joseph D, Sarfatis M Claire, Ford Stephanie M, Lidstrom Mary E, Beck David A C

Affiliations

The Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington 98195, United States.

Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States.

Publication information

ACS Synth Biol. 2021 Jun 18;10(6):1394-1405. doi: 10.1021/acssynbio.1c00017. Epub 2021 May 14.

DOI: 10.1021/acssynbio.1c00017

PMID: 33988977

Abstract

Engineering microorganisms into biological factories that convert renewable feedstocks into valuable materials is a major goal of synthetic biology; however, for many nonmodel organisms, we do not yet have the genetic tools, such as suites of strong promoters, necessary to effectively engineer them. In this work, we developed a computational framework that can leverage standard RNA-seq data sets to identify sets of constitutive, strongly expressed genes and predict strong promoter signals within their upstream regions. The framework was applied to a diverse collection of RNA-seq data measured for the methanotroph 5GB1 and identified 25 genes that were constitutively, strongly expressed across 12 experimental conditions. For each gene, the framework predicted short (27-30 nucleotide) sequences as candidate promoters and derived -35 and -10 consensus promoter motifs (TTGACA and TATAAT, respectively) for strong expression in . This consensus closely matches the canonical sigma-70 motif and was found to be enriched in promoter regions of the genome. A subset of promoter predictions was experimentally validated in a XylE reporter assay, including the consensus promoter, which showed high expression. The , , and promoter predictions were additionally screened in an experiment that scrambled the -35 and -10 signal sequences, confirming that transcription initiation was disrupted when these specific regions of the predicted sequence were altered. These results indicate that the computational framework can make biologically meaningful promoter predictions and identify key pieces of regulatory systems that can serve as foundational tools for engineering diverse microorganisms for biomolecule production.
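The two computational steps named in the abstract (picking constitutive, strongly expressed genes, then looking for -35/-10 signals upstream of them) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean and coefficient-of-variation thresholds, the match-counting score, and the 15-18 nt spacer range (chosen here only so that a hexamer-spacer-hexamer site spans 27-30 nt, in line with the lengths the abstract reports) are all assumptions made for illustration. Only the TTGACA and TATAAT consensus hexamers come from the text.

```python
from statistics import mean, pstdev

MINUS_35 = "TTGACA"  # -35 consensus reported in the abstract
MINUS_10 = "TATAAT"  # -10 consensus reported in the abstract


def select_constitutive_strong(expr, min_mean=100.0, max_cv=0.5):
    """Return genes with high, stable expression across conditions.

    expr maps gene name -> list of expression values (e.g. TPM),
    one value per condition. min_mean and max_cv are hypothetical
    thresholds, not values taken from the paper.
    """
    selected = []
    for gene, values in expr.items():
        mu = mean(values)
        if mu <= 0 or mu < min_mean:
            continue  # not strongly expressed
        cv = pstdev(values) / mu  # coefficient of variation
        if cv <= max_cv:
            selected.append(gene)  # strong and roughly constant
    return sorted(selected)


def hexamer_matches(window, consensus):
    """Count positions where a 6-mer agrees with the consensus."""
    return sum(a == b for a, b in zip(window, consensus))


def best_sigma70_site(upstream, spacers=range(15, 19)):
    """Scan an upstream sequence for the best -35/-10 hexamer pair.

    Returns (score, start of -35 box, spacer length, -35 box, -10 box),
    where score is the number of consensus matches (maximum 12),
    or None if the sequence is too short. The spacer range is an
    assumption of this sketch.
    """
    upstream = upstream.upper()
    best = None
    for i in range(len(upstream) - 5):
        box35 = upstream[i:i + 6]
        for spacer in spacers:
            j = i + 6 + spacer
            box10 = upstream[j:j + 6]
            if len(box10) < 6:
                break  # longer spacers would also run off the end
            score = (hexamer_matches(box35, MINUS_35)
                     + hexamer_matches(box10, MINUS_10))
            if best is None or score > best[0]:
                best = (score, i, spacer, box35, box10)
    return best
```

In use, `select_constitutive_strong` would be run on a gene-by-condition expression table, and `best_sigma70_site` on the upstream region of each selected gene. A real pipeline would more likely use a dedicated motif-discovery tool than a simple match count.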

Similar articles

Inferring regulatory elements from a whole genome. An analysis of Helicobacter pylori sigma(80) family of promoter signals.

J Mol Biol. 2000 Mar 24;297(2):335-53. doi: 10.1006/jmbi.2000.3576.

Genome-wide prediction and validation of sigma70 promoters in Lactobacillus plantarum WCFS1.

PLoS One. 2012;7(9):e45097. doi: 10.1371/journal.pone.0045097. Epub 2012 Sep 20.

Assessing the effects of data selection and representation on the development of reliable E. coli sigma 70 promoter region predictors.

PLoS One. 2015 Mar 24;10(3):e0119721. doi: 10.1371/journal.pone.0119721. eCollection 2015.

The minus 35-recognition region of Escherichia coli sigma 70 is inessential for initiation of transcription at an "extended minus 10" promoter.

J Mol Biol. 1993 Jul 20;232(2):406-18. doi: 10.1006/jmbi.1993.1400.

Structure of a Core Promoter in Bifidobacterium longum NCC2705.

J Bacteriol. 2020 Mar 11;202(7). doi: 10.1128/JB.00540-19.

Triad pattern algorithm for predicting strong promoter candidates in bacterial genomes.

BMC Bioinformatics. 2008 May 9;9:233. doi: 10.1186/1471-2105-9-233.

Analysis of the nucleotide content of Escherichia coli promoter sequences related to the alternative sigma factors.

J Mol Recognit. 2019 May;32(5):e2770. doi: 10.1002/jmr.2770. Epub 2018 Nov 20.

Genome-wide determination of transcription start sites reveals new insights into promoter structures in the actinomycete Corynebacterium glutamicum.

J Biotechnol. 2017 Sep 10;257:99-109. doi: 10.1016/j.jbiotec.2017.04.008. Epub 2017 Apr 13.

Sigma70 promoters in Escherichia coli: specific transcription in dense regions of overlapping promoter-like signals.

J Mol Biol. 2003 Oct 17;333(2):261-78. doi: 10.1016/j.jmb.2003.07.017.

Cited by

Transcription regulation strategies in methylotrophs: progress and challenges.

Bioresour Bioprocess. 2022 Dec 12;9(1):126. doi: 10.1186/s40643-022-00614-3.

Construction of a broad-host-range Anderson promoter series and particulate methane monooxygenase promoter variants expand the methanotroph genetic toolbox.

Synth Syst Biotechnol. 2024 Feb 19;9(2):250-258. doi: 10.1016/j.synbio.2024.02.003. eCollection 2024 Jun.

A methanotrophic bacterium to enable methane removal for climate mitigation.

Proc Natl Acad Sci U S A. 2023 Aug 29;120(35):e2310046120. doi: 10.1073/pnas.2310046120. Epub 2023 Aug 21.

Genetic Engineering of Resident Bacteria in the Gut Microbiome.

J Bacteriol. 2023 Jul 25;205(7):e0012723. doi: 10.1128/jb.00127-23. Epub 2023 Jun 29.

Understanding Autologous Spliceostatin Transcriptional Regulation to Derive Parts for Heterologous Expression in a Bacterial Host.

ACS Synth Biol. 2023 Jul 21;12(7):1952-1960. doi: 10.1021/acssynbio.3c00228. Epub 2023 Jun 20.

Engineered Sucrose Metabolism Improves the Smut Disease Suppression Potency of Pseudomonas sp. ST4.

Appl Environ Microbiol. 2023 May 31;89(5):e0220822. doi: 10.1128/aem.02208-22. Epub 2023 Apr 24.

Maximizing the utility of public data.

Front Genet. 2023 Mar 31;14:1106631. doi: 10.3389/fgene.2023.1106631. eCollection 2023.

Efficient biosynthesis of (R)-mandelic acid from styrene oxide by an adaptive evolutionary Gluconobacter oxydans STA.

Biotechnol Biofuels Bioprod. 2023 Jan 13;16(1):8. doi: 10.1186/s13068-023-02258-7.

Isolation and evaluation of strong endogenous promoters for the heterologous expression of proteins in Pichia pastoris.

World J Microbiol Biotechnol. 2022 Sep 19;38(12):226. doi: 10.1007/s11274-022-03412-3.

Stochastic Simulations as a Tool for Assessing Signal Fidelity in Gene Expression in Synthetic Promoter Design.

Biology (Basel). 2021 Jul 29;10(8):724. doi: 10.3390/biology10080724.
