Department of Chemistry, New York University, 1001 Silver, 100 Washington Square East, New York, NY 10003, USA.
Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Nucleic Acids Res. 2018 Aug 21;46(14):7040-7051. doi: 10.1093/nar/gky524.
Designing novel RNA topologies is a challenge, with important therapeutic and industrial applications. We describe a computational pipeline for the design of novel RNA topologies based on our coarse-grained RNA-As-Graphs (RAG) framework. RAG represents RNA structures as tree graphs and describes RNA secondary (2D) structure topologies (currently up to 13 vertices, ≈260 nucleotides). We have previously identified novel graph topologies among these that are RNA-like. Here we describe a systematic design pipeline and illustrate it on six broad design problems using recently developed tools for graph partitioning and fragment assembly (F-RAG). Following partitioning of the target graph, corresponding atomic fragments from our RAG-3D database are combined using F-RAG, and the candidate atomic models are scored using a knowledge-based potential developed for 3D structure prediction. The sequences of the top-scoring models are screened further using available tools for 2D structure prediction. The results indicate that our modular approach, based on RNA-like topologies rather than specific 2D structures, allows for greater flexibility in the design process and generates a large number of candidate sequences quickly. Experimental structure probing using SHAPE-MaP for two sequences agrees with our predictions and suggests that our combined tools yield excellent candidates for further sequence and experimental screening.
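To make the tree-graph representation concrete, the following is a minimal sketch of how a pseudoknot-free 2D structure in dot-bracket notation could be reduced to a tree in which loops (including the exterior 5'/3' region) become vertices and helices become edges. The function name `dot_bracket_to_tree` and the simplifications used here are assumptions made for illustration; the actual RAG rules (for example, minimum helix length and the treatment of small bulges) are not reproduced.

```python
def dot_bracket_to_tree(structure):
    """Reduce a pseudoknot-free dot-bracket string to a tree graph.

    Illustrative simplification (not the RAG implementation):
    every loop, including the exterior loop, is a vertex, and every
    maximal run of stacked base pairs is an edge.
    Returns (number_of_vertices, list_of_edges).
    """
    # Build the base-pair table with a stack.
    pairs, stack = {}, []
    for idx, ch in enumerate(structure):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % idx)
            j = stack.pop()
            pairs[j], pairs[idx] = idx, j
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])

    edges = []
    n_vertices = 1  # vertex 0 is the exterior loop (5'/3' ends)
    # Each work item is (vertex id, first index, one-past-last index)
    # describing the interior of one loop.
    work = [(0, 0, len(structure))]
    while work:
        vertex, start, end = work.pop()
        k = start
        while k < end:
            if k in pairs:
                # k opens a helix; follow stacked pairs inward.
                a, b = k, pairs[k]
                while a + 1 < b - 1 and pairs.get(a + 1) == b - 1:
                    a, b = a + 1, b - 1
                child = n_vertices
                n_vertices += 1
                edges.append((vertex, child))
                work.append((child, a + 1, b))
                k = pairs[k] + 1  # skip past the whole helix
            else:
                k += 1
    return n_vertices, edges


if __name__ == "__main__":
    # A stem leading into a three-way junction with two hairpins.
    n, e = dot_bracket_to_tree("((..((...))..((...))..))")
    print(n, sorted(e))
```

Because each helix connects exactly one enclosing loop to one enclosed loop, the result for a pseudoknot-free structure always has one fewer edge than vertices, which is the defining property of a tree.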