Fakhry Carl T, Zarringhalam Kourosh, Kulkarni Rahul V
Department of Computer Science, University of Massachusetts Boston, Boston, MA, USA.
Department of Mathematics, University of Massachusetts Boston, Boston, MA, USA.
Methods Mol Biol. 2018;1737:47-56. doi: 10.1007/978-1-4939-7634-8_3.
CsrA/RsmA is an RNA-binding protein that functions as a global regulator controlling important processes such as virulence, secondary metabolism, motility, and biofilm formation in diverse bacterial species. The activity of CsrA/RsmA is regulated by small RNAs that contain multiple binding sites for the protein. The expression of these noncoding RNAs effectively sequesters the protein and reduces free cellular levels of CsrA/RsmA. While multiple bacterial small RNAs that bind to and regulate CsrA/RsmA levels have been discovered, it is anticipated that several such small RNAs remain undiscovered. To assist in the discovery of these small RNAs, we have developed a bioinformatics approach that combines sequence- and structure-based features to predict small RNA regulators of CsrA/RsmA. This approach analyzes structural motifs in the ensemble of low-energy secondary structures of known small RNA regulators of CsrA/RsmA and trains a binary classifier on these features. The proposed machine learning approach leads to several testable predictions for small RNA regulators of CsrA/RsmA, thereby complementing and accelerating experimental efforts aimed at discovering noncoding RNAs in the CsrA/RsmA pathway.
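The feature-extraction idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the binding-site motif is the GGA core commonly reported for CsrA/RsmA, that candidate secondary structures are supplied in dot-bracket notation (for example, from an external folding tool), and that the function names and feature names are hypothetical. The resulting feature dictionary could then be passed to any binary classifier.

```python
from typing import Dict, List

# Assumption: GGA core motif, commonly reported for CsrA/RsmA binding sites.
MOTIF = "GGA"


def motif_positions(seq: str, motif: str = MOTIF) -> List[int]:
    """Return the 0-based start index of every occurrence of motif in seq."""
    seq = seq.upper().replace("T", "U")
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def count_unpaired_motifs(seq: str, structure: str, motif: str = MOTIF) -> int:
    """Count motif occurrences whose bases are all unpaired ('.') in a
    dot-bracket structure of the same length as seq."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return sum(
        1
        for i in motif_positions(seq, motif)
        if set(structure[i:i + len(motif)]) == {"."}
    )


def extract_features(seq: str, structures: List[str]) -> Dict[str, float]:
    """Combine sequence features (motif count, density) with a structure
    feature averaged over an ensemble of candidate structures."""
    n_motifs = len(motif_positions(seq))
    if n_motifs and structures:
        fractions = [count_unpaired_motifs(seq, s) / n_motifs for s in structures]
        mean_unpaired = sum(fractions) / len(fractions)
    else:
        mean_unpaired = 0.0
    return {
        "length": float(len(seq)),
        "n_motifs": float(n_motifs),
        "motif_density": n_motifs / len(seq) if seq else 0.0,
        "mean_unpaired_fraction": mean_unpaired,
    }


# Hypothetical toy sequence with two illustrative structures (not real data).
toy_seq = "CCCCAGGAUGGGGAAGGAU"
toy_structures = ["((((.....))))......", "." * len(toy_seq)]
toy_features = extract_features(toy_seq, toy_structures)
```

Averaging the structure feature over several structures, rather than using only a single minimum-energy fold, mirrors the ensemble-based view described in the abstract, although the specific features used here are illustrative only.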