Lloréns-Rico Verónica, Lluch-Senar Maria, Serrano Luis
EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), Dr Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Dr Aiguader 88, 08003 Barcelona, Spain.
Nucleic Acids Res. 2015 Apr 20;43(7):3442-53. doi: 10.1093/nar/gkv170. Epub 2015 Mar 16.
Distinguishing between promoter-like sequences in bacteria that belong to true or abortive promoters, or to those that do not initiate transcription at all, is one of the important challenges in transcriptomics. To address this problem, we have studied the genome-reduced bacterium Mycoplasma pneumoniae, for which the RNAs associated with transcriptional start sites have recently been identified experimentally. We determined the contribution to transcription events of different genomic features: the -10, extended -10 and -35 boxes, the UP element, the bases surrounding the -10 box and the nearest-neighbor free energy of the promoter region. Using a random forest classifier and the aforementioned features transformed into scores, we could distinguish between true promoters, abortive promoters and non-promoters with good -10 box sequences. The methods used in this characterization of promoters can be extended to other bacteria and have important applications for promoter design in bacterial genome engineering.
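As an illustration of what "features transformed into scores" could look like for a single feature, the sketch below scores candidate -10 boxes with a log-odds position weight matrix (PWM). It is not the authors' scoring scheme, which the abstract does not specify. The function names, the uniform background frequencies, the pseudocount and the six toy hexamers in `TOY_SITES` are all assumptions made up for this example, not data from the study.

```python
import math

BASES = "ACGT"

# Hypothetical -10 box examples, invented for illustration only.
TOY_SITES = ["TATAAT", "TAAAAT", "TATAAT", "TAGAAT", "TACAAT", "TATACT"]


def build_pwm(sites, pseudocount=1.0, background=None):
    """Build a log2-odds PWM (one dict per position) from aligned sites."""
    if background is None:
        # Assumption: uniform base composition; a real genome would differ.
        background = {b: 0.25 for b in BASES}
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for i in range(length):
        # Start every count at the pseudocount so unseen bases stay finite.
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append(
            {b: math.log2((counts[b] / total) / background[b]) for b in BASES}
        )
    return pwm


def score_site(pwm, site):
    """Sum the per-position log-odds for one candidate site."""
    return sum(column[base] for column, base in zip(pwm, site))


def best_hit(pwm, sequence):
    """Return (score, position) of the best-scoring window in sequence."""
    width = len(pwm)
    hits = [
        (score_site(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)
```

In a pipeline of the kind the abstract describes, a score like this would be computed for each feature and the resulting vector passed to a classifier such as a random forest; that step is omitted here.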