Bekiaris Pavlos Stephanos, Tekath Tobias, Staiger Dorothee, Danisman Selahattin
RNA Biology and Molecular Physiology, Faculty of Biology, Bielefeld University, Bielefeld, Germany.
PLoS One. 2018 Jan 3;13(1):e0190421. doi: 10.1371/journal.pone.0190421. eCollection 2018.
Understanding the effect of cis-regulatory elements (CREs) and of clusters of CREs, called cis-regulatory modules (CRMs), on eukaryotic gene expression is a challenge in computational biology. We developed two programs that allow simple, fast and reliable analysis of candidate CREs and CRMs that may affect specific gene expression, and that determine positional features between individual CREs within a CRM. The first program, "Exploration of Distinctive CREs and CRMs" (EDCC), correlates candidate CREs and CRMs with specific gene expression patterns. For pairs of CREs, EDCC also determines the positional preferences of the individual CREs relative to each other and to the transcriptional start site. The second program, "CRM Network Generator" (CNG), prioritizes these positional preferences using a neural network and thus allows unbiased rating of the positional preferences determined by EDCC. We tested these programs with data from a microarray study of circadian gene expression in Arabidopsis thaliana. Analyzing more than 1.5 million pairwise CRE combinations, we found 22 candidate combinations, several of which contained known clock promoter elements together with elements that had not previously been identified as relevant to circadian gene expression. CNG analysis further identified positional preferences of these CRE pairs, hinting at positional information that may be relevant for circadian gene expression. Future wet-lab experiments will have to determine which of these combinations confer daytime-specific circadian gene expression.
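To make the idea of correlating a CRE pair with an expression pattern more concrete, the following is a minimal sketch and not the EDCC implementation. The abstract does not state which statistic EDCC uses, so the hypergeometric enrichment test here is an assumption, as are all function names, the toy promoter sequences and the gene cluster. The two motifs are the well-known evening element (AAAATATCT) and G-box (CACGTG). The sketch searches the forward strand only and measures the distance between the two motifs, not the distance to the transcriptional start site.

```python
from math import comb

# Well-known plant promoter motifs, used here purely for illustration.
EVENING_ELEMENT = "AAAATATCT"
G_BOX = "CACGTG"


def find_sites(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


def hypergeom_upper_tail(k, n_total, n_carriers, n_cluster):
    """P(X >= k) when n_cluster genes are drawn without replacement from
    n_total genes, n_carriers of which contain the motif pair."""
    upper = min(n_carriers, n_cluster)
    favourable = sum(
        comb(n_carriers, i) * comb(n_total - n_carriers, n_cluster - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, n_cluster)


def pair_enrichment(promoters, cluster, motif_a, motif_b):
    """Count cluster genes whose promoter carries both motifs and test
    whether that count exceeds what random sampling would give."""
    carriers = {
        gene for gene, seq in promoters.items()
        if find_sites(seq, motif_a) and find_sites(seq, motif_b)
    }
    k = len(carriers & cluster)
    p = hypergeom_upper_tail(k, len(promoters), len(carriers), len(cluster))
    return k, len(carriers), p


def pair_distances(seq, motif_a, motif_b):
    """Signed distances (start of b minus start of a) for all site pairs."""
    return [b - a for a in find_sites(seq, motif_a) for b in find_sites(seq, motif_b)]


# Hypothetical toy data: six promoters and one co-expressed gene cluster.
promoters = {
    "g1": "TTAAAATATCTGGCACGTGAA",
    "g2": "CACGTGTTTTAAAATATCTCC",
    "g3": "GGGGAAAATATCTGGGG",
    "g4": "TTTTCACGTGTTTT",
    "g5": "ACACACACACAC",
    "g6": "GTGTGTGTGTGT",
}
cluster = {"g1", "g2", "g5"}

hits, carriers, p_value = pair_enrichment(promoters, cluster, EVENING_ELEMENT, G_BOX)
print(hits, carriers, p_value)
print(pair_distances(promoters["g1"], EVENING_ELEMENT, G_BOX))
```

In a real analysis the promoter set would cover the whole genome, both strands would be scanned, and the p-values from over a million pairwise combinations would need multiple-testing correction. The signed distances collected per gene are the kind of positional feature that a downstream ranking step could then prioritize.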