Millman Adi, Dar Daniel, Shamir Maya, Sorek Rotem
Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel.
Nucleic Acids Res. 2017 Jan 25;45(2):886-893. doi: 10.1093/nar/gkw749. Epub 2016 Aug 29.
A common strategy for regulation of gene expression in bacteria is conditional transcription termination. This strategy is frequently employed by 5'UTR cis-acting RNA elements (riboregulators), including riboswitches and attenuators. Such riboregulators can assume two mutually exclusive RNA structures: one forms a transcriptional terminator and results in premature termination, and the other forms an antiterminator that allows read-through into the coding sequence to produce a full-length mRNA. We developed a machine-learning-based approach that, given the 5'UTR of a gene, predicts whether it can form the two alternative structures typical of riboregulators employing conditional termination. Using a large positive training set of riboregulators derived from 89 human microbiome bacteria, we show high specificity and sensitivity for our classifier. We further show that our approach allows the discovery of previously unidentified riboregulators, as exemplified by the detection of new LeuA leaders and T-boxes in Streptococci. Finally, we developed PASIFIC (www.weizmann.ac.il/molgen/Sorek/PASIFIC/), an online web server that, given a user-provided 5'UTR sequence, predicts whether this sequence can adopt two alternative structures conforming to the conditional termination paradigm. This web server is expected to assist in the identification of new riboswitches and attenuators in the bacterial pan-genome.
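The two-structure idea can be illustrated with a toy Python sketch. It is not the PASIFIC classifier, which is a trained machine-learning model; it is only a string-matching heuristic under stated assumptions: a terminator candidate is taken to be a perfect hairpin followed by a U-rich tract, and an antiterminator candidate is an upstream segment able to pair with the 5' arm of that hairpin. All function names, parameters, and thresholds here are hypothetical and chosen for illustration.

```python
# Toy heuristic only: NOT the PASIFIC method. Thresholds are illustrative assumptions.
# Watson-Crick pairs plus the G-U wobble pair allowed in RNA stems.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairs_with(arm5, arm3):
    """True if arm5 fully base-pairs with arm3 in antiparallel orientation."""
    return len(arm5) == len(arm3) and all(
        (a, b) in PAIRS for a, b in zip(arm5, reversed(arm3))
    )


def find_terminator_candidates(seq, stem=6, loops=range(3, 9), u_window=8, min_u=5):
    """Return (stem_start, loop_len, hairpin_end) for hairpins followed by a U-rich tract."""
    hits = []
    for i in range(len(seq)):
        for loop in loops:
            j = i + stem + loop   # start of the 3' arm
            end = j + stem        # first base after the hairpin
            if end + u_window > len(seq):
                continue
            u_count = seq[end:end + u_window].count("U")
            if pairs_with(seq[i:i + stem], seq[j:end]) and u_count >= min_u:
                hits.append((i, loop, end))
    return hits


def find_antiterminator_candidates(seq, stem_start, stem=6):
    """Return upstream positions whose segment can pair with the terminator's 5' arm."""
    arm5 = seq[stem_start:stem_start + stem]
    return [
        k for k in range(0, stem_start - stem + 1)
        if pairs_with(seq[k:k + stem], arm5)
    ]


def has_two_alternative_structures(utr):
    """Crude yes/no: a terminator candidate plus a competing upstream segment."""
    seq = utr.upper().replace("T", "U")  # accept DNA or RNA input
    return any(
        find_antiterminator_candidates(seq, start)
        for start, _loop, _end in find_terminator_candidates(seq)
    )


# Synthetic example sequences (made up for illustration, not real 5'UTRs).
with_competitor = "AAAA" + "GGCGGC" + "AAAAAAAA" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU"
terminator_only = "AAAA" + "AAAAAA" + "AAAAAAAA" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU"
print(has_two_alternative_structures(with_competitor))
print(has_two_alternative_structures(terminator_only))
```

In the first synthetic sequence, the upstream `GGCGGC` segment can pair with the 5' arm of the hairpin, so the two structures are mutually exclusive; the second sequence has the hairpin and U-tract but no competing segment. A realistic predictor would rely on thermodynamic folding and learned features rather than exact pairing.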