López Yosvany, Vandenbon Alexis, Nose Akinao, Nakai Kenta
Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan.
Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan.
PeerJ. 2017 May 30;5:e3389. doi: 10.7717/peerj.3389. eCollection 2017.
Because transcription is the first step in the regulation of gene expression, understanding how transcription factors bind to their DNA binding motifs is essential. Promoters of genes with similar expression profiles have been shown to share common structural patterns. This paper presents an extensive study of the regulatory regions of genes expressed in 24 developmental stages of Drosophila melanogaster. It proposes predicting gene expression from a combination of structural features of promoter sequences: the position of individual motifs relative to the transcription start site, motif orientation, the pairwise distance between motifs, and the presence of motifs anywhere in the promoter. RNA-sequencing (RNA-seq) data were used to create and validate the 24 models. When genes with high-scoring promoters were compared with those identified in the RNA-seq samples, 19 of the 24 models (79.2%) were statistically significant, a number that exceeds those reported in previous studies. Each model yielded a set of highly informative features, which were used to search for genes with similar biological functions.
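The four kinds of structural features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `MotifHit` record, the `structural_features` and `promoter_score` functions, the 100 bp binning, and the additive weighting are all assumptions made here for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class MotifHit:
    """Hypothetical record for one motif occurrence in a promoter."""
    motif: str      # motif name
    position: int   # offset from the transcription start site (negative = upstream)
    strand: str     # '+' or '-'


def structural_features(hits, bin_size=100):
    """Turn motif hits into a set of discrete structural features.

    bin_size (an assumed value) groups positions and distances into bins
    so that nearby placements map to the same feature.
    """
    features = set()
    for h in hits:
        features.add(("presence", h.motif))                       # motif anywhere in promoter
        features.add(("orientation", h.motif, h.strand))          # strand of the motif
        features.add(("position", h.motif, h.position // bin_size))  # binned offset from the TSS
    for a, b in combinations(hits, 2):
        first, second = sorted((a.motif, b.motif))                # order-independent pair key
        distance_bin = abs(a.position - b.position) // bin_size
        features.add(("pair_distance", first, second, distance_bin))
    return features


def promoter_score(features, weights):
    """Assumed additive score: sum the weights of the features that are present."""
    return sum(weights.get(f, 0.0) for f in features)


hits = [MotifHit("A", -250, "+"), MotifHit("B", -80, "-")]
feats = structural_features(hits)
weights = {("presence", "A"): 1.5, ("pair_distance", "A", "B", 1): 2.0}
print(promoter_score(feats, weights))
```

In a scheme like this, promoters would be ranked by score and the top-ranked genes compared with those expressed in the RNA-seq samples for a given stage; how the weights are learned and how significance is assessed is left open here.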