A Computational Framework for Identifying Promoter Sequences in Nonmodel Organisms Using RNA-seq Data Sets.

Author information

Wilson Erin H, Groom Joseph D, Sarfatis M Claire, Ford Stephanie M, Lidstrom Mary E, Beck David A C

Affiliations

The Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, Washington 98195, United States.

Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States.

Publication information

ACS Synth Biol. 2021 Jun 18;10(6):1394-1405. doi: 10.1021/acssynbio.1c00017. Epub 2021 May 14.

Abstract

Engineering microorganisms into biological factories that convert renewable feedstocks into valuable materials is a major goal of synthetic biology; however, for many nonmodel organisms, we do not yet have the genetic tools, such as suites of strong promoters, necessary to effectively engineer them. In this work, we developed a computational framework that can leverage standard RNA-seq data sets to identify sets of constitutive, strongly expressed genes and predict strong promoter signals within their upstream regions. The framework was applied to a diverse collection of RNA-seq data measured for the methanotroph 5GB1 and identified 25 genes that were constitutively, strongly expressed across 12 experimental conditions. For each gene, the framework predicted short (27-30 nucleotide) sequences as candidate promoters and derived -35 and -10 consensus promoter motifs (TTGACA and TATAAT, respectively) for strong expression in this organism. This consensus closely matches the canonical sigma-70 motif and was found to be enriched in promoter regions of the genome. A subset of promoter predictions was experimentally validated in a XylE reporter assay, including the consensus promoter, which showed high expression. Three of the promoter predictions were additionally screened in an experiment that scrambled the -35 and -10 signal sequences, confirming that transcription initiation was disrupted when these specific regions of the predicted sequence were altered. These results indicate that the computational framework can make biologically meaningful promoter predictions and identify key pieces of regulatory systems that can serve as foundational tools for engineering diverse microorganisms for biomolecule production.
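
The two steps summarized in the abstract (selecting constitutive, strongly expressed genes, then looking for -35/-10 signals upstream of them) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the coefficient-of-variation filter, the thresholds (`min_mean`, `max_cv`, `max_mismatches`), and the simple mismatch-counting scan are all assumptions made for illustration. The 15 to 18 nucleotide spacer range is likewise an assumption, chosen only so that two hexamers plus the spacer span the 27-30 nucleotides mentioned above.

```python
from statistics import mean, pstdev


def select_constitutive_strong(expression, min_mean=100.0, max_cv=0.3):
    """Return gene IDs that are highly and stably expressed across conditions.

    `expression` maps a gene ID to a list of expression values (for example
    TPM), one per condition. Both thresholds are illustrative assumptions.
    """
    selected = []
    for gene, values in expression.items():
        avg = mean(values)
        if avg < min_mean:
            continue  # not strongly expressed on average
        cv = pstdev(values) / avg  # coefficient of variation across conditions
        if cv <= max_cv:
            selected.append(gene)
    return sorted(selected)


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_upstream(seq, box35="TTGACA", box10="TATAAT",
                  spacers=(15, 16, 17, 18), max_mismatches=2):
    """Find -35/-10 hexamer pairs in an upstream sequence.

    Returns a sorted list of (total_mismatches, start_index, candidate),
    where `candidate` spans the -35 box, the spacer, and the -10 box.
    The spacer range and mismatch budget are hypothetical defaults.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(box35) + 1):
        m35 = hamming(seq[i:i + 6], box35)
        for gap in spacers:
            j = i + 6 + gap  # start of the putative -10 box
            if j + 6 > len(seq):
                continue  # -10 box would run past the end of the sequence
            total = m35 + hamming(seq[j:j + 6], box10)
            if total <= max_mismatches:
                hits.append((total, i, seq[i:j + 6]))
    return sorted(hits)


if __name__ == "__main__":
    # Toy values, invented purely to exercise the functions.
    toy_expression = {
        "geneA": [500, 520, 480],   # high and stable
        "geneB": [500, 50, 900],    # high on average but variable
        "geneC": [10, 12, 11],      # stable but weakly expressed
    }
    print(select_constitutive_strong(toy_expression))

    toy_upstream = "GGG" + "TTGACA" + "A" * 17 + "TATAAT" + "CCC"
    for mismatches, start, candidate in scan_upstream(toy_upstream):
        print(mismatches, start, candidate, len(candidate))
```

A real analysis would more likely use a dedicated motif-discovery tool and position weight matrices rather than exact-hexamer mismatch counts; the sketch is only meant to make the filtering and scanning ideas concrete.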
