Multiple novel promoter-architectures revealed by decoding the hidden heterogeneity within the genome.

Author information

Narlikar Leelavati

Affiliation

Chemical Engineering Division, National Chemical Laboratory, Dr. Homi Bhabha Road, Pune 411008, India

Publication information

Nucleic Acids Res. 2014 Nov 10;42(20):12388-403. doi: 10.1093/nar/gku924. Epub 2014 Oct 17.

Abstract

An important question in biology is how different promoter-architectures contribute to the diversity in regulation of transcription initiation. A step forward has been the production of genome-wide maps of transcription start sites (TSSs) using high-throughput sequencing. However, the subsequent step of characterizing promoters and their functions is still largely done on the basis of previously established promoter-elements like the TATA-box in eukaryotes or the -10 box in bacteria. Unfortunately, a majority of promoters and their activities cannot be explained by these few elements. Traditional motif discovery methods that identify novel elements also fail here, because TSS neighborhoods are often highly heterogeneous, containing no overrepresented motif. We present a new, organism-independent method that explicitly models this heterogeneity while unraveling different promoter-architectures. For example, in five bacteria, we detect the presence of a pyrimidine preceding the TSS under very specific circumstances. In tuberculosis, we show for the first time that the spacing between the bacterial -10 motif and the TSS is utilized by the pathogen for dynamic gene-regulation. In eukaryotes, we identify several new elements that are important for development. Identified promoter-architectures show differential patterns of evolution, chromatin structure and TSS spread, suggesting distinct regulatory functions. This work highlights the importance of characterizing heterogeneity within high-throughput genomic data rather than analyzing average patterns of nucleotide composition.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c90d/4227772/51ca80ce35f3/gku924fig1.jpg
