School of Environmental Sciences, Jawaharlal Nehru University, New Delhi, India.
Centre for Systems Biology, School of Life Sciences, University of Hyderabad, Hyderabad, India.
BMC Genomics. 2019 Mar 12;20(1):206. doi: 10.1186/s12864-019-5570-z.
Promoter motifs in Entamoeba histolytica were previously analysed using microarray data, which have a lower dynamic range of gene expression. In addition, previous transcriptomic studies did not provide information on the nature of highly transcribed genes or on downstream promoter motifs important for gene expression. To address these issues, we generated RNA-Seq data and identified high- and low-expressing genes, especially with respect to virulence potential. We analysed sequences both upstream and downstream of the start site for important motifs.
We used RNA-Seq data to classify genes according to expression levels, which spanned six orders of magnitude. Data were validated by reporter gene expression. Virulence-related genes (except AIG1) were amongst the highly expressed, while some kinases and BspA family genes were poorly expressed. We looked for conserved motifs in sequences upstream and downstream of the initiation codon. Following enrichment analysis with AME, we found seven motifs significantly enriched in the high-expression class and three in the low-expression class. Two of these motifs (M4 and M6) were located downstream of the AUG, were enriched exclusively in the high-expression class, and were mostly found in ribosomal protein and translation-related genes. Deletion of these motifs resulted in drastic downregulation of reporter gene expression, showing functional relevance. The distribution of core promoter motifs (TATA, GAAC, and Inr) across all genes revealed that genes with downstream motifs were not preferentially associated with TATA-less promoters. We also examined gene expression changes in cells subjected to growth stress by serum starvation, and experimentally validated the data. Genes showing maximum upregulation belonged to the low- or medium-expression class, and included genes involved in signalling pathways, lipid metabolism, DNA repair, and heat shock, as well as Myb transcription factors and BspA. Genes showing maximum downregulation belonged to the high- or medium-expression class. They included genes for signalling factors, actin, the Ariel family, and ribosome biogenesis factors.
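The classification step described above can be illustrated with a minimal sketch. The abstract does not state the expression unit or the cutoffs used, so the function names, the `low_cut` and `high_cut` thresholds, and the example values below are all hypothetical and serve only to show the idea of binning genes into low, medium, and high classes and measuring the dynamic range in orders of magnitude.

```python
import math


def classify_by_expression(expr, low_cut=1.0, high_cut=100.0):
    """Assign each gene to 'low', 'medium', or 'high' by its expression value.

    The cutoffs are hypothetical placeholders, not the study's thresholds.
    """
    classes = {}
    for gene, value in expr.items():
        if value < low_cut:
            classes[gene] = "low"
        elif value < high_cut:
            classes[gene] = "medium"
        else:
            classes[gene] = "high"
    return classes


def dynamic_range_orders(expr):
    """Return the orders of magnitude spanned by the nonzero expression values."""
    positive = [v for v in expr.values() if v > 0]
    return math.log10(max(positive) / min(positive))


# Made-up values for illustration only; not data from the study.
example = {"geneA": 0.02, "geneB": 5.0, "geneC": 350.0, "geneD": 20000.0}
classes = classify_by_expression(example)
span = dynamic_range_orders(example)
```

Fixed cutoffs are used here for simplicity; quantile-based cutoffs would be an equally plausible choice, and the abstract does not say which approach was taken.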
Our analysis adds important new information about the E. histolytica transcriptome. We report for the first time two downstream motifs required for gene expression, which could be used for overexpression of E. histolytica genes. Most of the virulence-related genes in this parasite are highly expressed in culture.