Pavlu Simon, Nikumbh Sarvesh, Kovacik Martin, An Tadaichi, Lenhard Boris, Simkova Hana, Navratilova Pavla
Institute of Experimental Botany of the Czech Academy of Sciences, Slechtitelu 31, 77900 Olomouc, Czech Republic.
Department of Cell Biology and Genetics, Faculty of Science, Palacky University, Slechtitelu 27, 78371 Olomouc, Czech Republic.
Comput Struct Biotechnol J. 2023 Dec 5;23:264-277. doi: 10.1016/j.csbj.2023.12.003. eCollection 2024 Dec.
Precise localization and dissection of gene promoters are key to understanding transcriptional gene regulation and to successful bioengineering applications. The core RNA polymerase II initiation machinery is highly conserved among eukaryotes, leading to a general expectation of equivalent underlying mechanisms. Still, less is known about promoters in the plant kingdom. In this study, we employed cap analysis of gene expression (CAGE) at three embryonic developmental stages in barley to accurately map, annotate, and quantify transcription initiation events. Unsupervised discovery of de novo sequence clusters grouped promoters based on characteristic initiator and position-specific core-promoter motifs. This grouping was complemented by the annotation of transcription factor binding site (TFBS) motifs. Integration with genome-wide epigenomic data sets and gene ontology (GO) enrichment analysis further delineated the chromatin environments and functional roles of genes associated with distinct promoter categories. The TATA-box presence governs all features explored, supporting the general model of two separate genomic regulatory environments. We describe the extent and implications of alternative transcription initiation events, including those that are specific to developmental stages, which can affect the protein sequence or the presence of regions that regulate translation. The generated promoterome dataset provides a valuable genomic resource for enhancing the functional annotation of the barley genome. It also offers insights into the transcriptional regulation of individual genes and presents opportunities for the informed manipulation of promoter architecture, with the aim of enhancing traits of agronomic importance.