Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, China.
Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada.
PLoS Comput Biol. 2022 Jul 12;18(7):e1010293. doi: 10.1371/journal.pcbi.1010293. eCollection 2022 Jul.
RNA molecules can adopt stable secondary and tertiary structures, which are essential for mediating physical interactions with partners such as RNA-binding proteins (RBPs) and for carrying out their cellular functions. In vitro and in vivo experiments, such as RNAcompete and eCLIP respectively, have revealed the in vitro binding preferences of RBPs for RNA oligomers and their in vivo binding sites in cells. Analysis of these binding data showed that the structural properties of the RNAs at these binding sites are important determinants of binding events; however, incorporating structural information into an interpretable model has remained a challenge. Here we describe a new approach, RNANetMotif, which takes the predicted secondary structures of thousands of RNA sequences bound by an RBP as input and uses a graph theory approach to recognize enriched subgraphs. These enriched subgraphs are in essence shared sequence-structure elements that are important in RBP-RNA binding. To validate our approach, we performed RNA structure modeling via coarse-grained molecular dynamics folding simulations for four selected RBPs, and RNA-protein docking for LIN28B. The simulation results, e.g., solvent accessibility and energetics, further support the biological relevance of the discovered network subgraphs.
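As a rough illustration of the graph view of a predicted secondary structure, the sketch below converts a sequence and its dot-bracket structure into a labeled graph (backbone edges plus base-pair edges) and then compares how often each base-pair type occurs in a bound set versus a background set. This is not the RNANetMotif implementation: the function names (`dot_bracket_to_graph`, `pair_label_counts`, `toy_enrichment`), the dot-bracket input format, and the pseudocount ratio are assumptions made for illustration, and counting single base-pair edges is only a toy stand-in for searching enriched subgraphs.

```python
from collections import Counter


def dot_bracket_to_graph(seq, structure):
    """Build a graph from an RNA sequence and a dot-bracket structure.

    Nodes are nucleotide positions (labels come from `seq`); edges are
    (i, j, kind) tuples with kind "backbone" or "pair". Hypothetical helper.
    """
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    edges = set()
    # Consecutive nucleotides are linked along the backbone.
    for i in range(len(seq) - 1):
        edges.add((i, i + 1, "backbone"))
    # Matching brackets define base-pair edges.
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            edges.add((stack.pop(), i, "pair"))
        elif ch != ".":
            raise ValueError(f"unsupported structure symbol: {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return list(seq), edges


def pair_label_counts(seq, structure):
    """Count base-pair edges by nucleotide labels, e.g. 'G-C' (5' base first)."""
    nodes, edges = dot_bracket_to_graph(seq, structure)
    return Counter(
        f"{nodes[i]}-{nodes[j]}" for i, j, kind in edges if kind == "pair"
    )


def toy_enrichment(bound, background):
    """Pseudocount ratio of pair-label counts, bound vs. background.

    Both arguments are lists of (sequence, structure) tuples. This is a toy
    score, assumed for illustration only.
    """
    b, g = Counter(), Counter()
    for seq, structure in bound:
        b.update(pair_label_counts(seq, structure))
    for seq, structure in background:
        g.update(pair_label_counts(seq, structure))
    return {label: (b[label] + 1) / (g[label] + 1) for label in set(b) | set(g)}


if __name__ == "__main__":
    bound = [("GGGAAACCC", "(((...)))")]
    background = [("AAAGGGUUU", "(((...)))")]
    print(toy_enrichment(bound, background))
```

A real analysis would obtain the structures from a folding tool and search for larger connected subgraphs rather than single edges; the point here is only that a dot-bracket string maps directly onto a graph whose edges carry both sequence and pairing information.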