Frellsen Jes, Moltke Ida, Thiim Martin, Mardia Kanti V, Ferkinghoff-Borg Jesper, Hamelryck Thomas
The Bioinformatics Center, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
PLoS Comput Biol. 2009 Jun;5(6):e1000406. doi: 10.1371/journal.pcbi.1000406. Epub 2009 Jun 19.
The increasing importance of non-coding RNA in biology and medicine has led to a growing interest in the problem of RNA 3-D structure prediction. As is the case for proteins, RNA 3-D structure prediction methods require two key ingredients: an accurate energy function and a conformational sampling procedure. Both are only partly solved problems. Here, we focus on the problem of conformational sampling. The current state-of-the-art solution is based on fragment assembly methods, which construct plausible conformations by stringing together short fragments obtained from experimental structures. However, the discrete nature of the fragments necessitates the use of carefully tuned, unphysical energy functions, and their non-probabilistic nature impairs unbiased sampling. We offer a solution to the sampling problem that removes these important limitations: a probabilistic model of RNA structure that allows efficient sampling of RNA conformations in continuous space, and with associated probabilities. We show that the model captures several key features of RNA structure, such as its rotameric nature and the distribution of the helix lengths. Furthermore, the model readily generates native-like 3-D conformations for 9 out of 10 test structures, solely using coarse-grained base-pairing information. In conclusion, the method provides a theoretical and practical solution for a major bottleneck on the way to routine prediction and simulation of RNA structure and dynamics in atomic detail.
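As a rough illustration of what "sampling in continuous space, with associated probabilities" can mean, the sketch below uses a toy hidden Markov chain that emits one angle per position from a von Mises (circular) distribution. The model form, the two hidden states, the function names and every parameter value are assumptions chosen for illustration; the abstract does not specify the architecture of the model, and this is not presented as the authors' implementation.

```python
import numpy as np

# Hypothetical toy parameters, not taken from the paper.
START = np.array([0.5, 0.5])                 # initial hidden-state probabilities
TRANS = np.array([[0.9, 0.1], [0.2, 0.8]])   # hidden-state transition matrix
MU = np.array([-1.0, 2.0])                   # mean angle per state (radians)
KAPPA = np.array([8.0, 4.0])                 # concentration per state


def von_mises_pdf(theta, mu, kappa):
    """Density of the von Mises distribution on the circle."""
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))


def sample_angles(n, rng):
    """Draw n angles by walking the hidden chain and emitting one angle per step."""
    angles = np.empty(n)
    state = rng.choice(2, p=START)
    for i in range(n):
        angles[i] = rng.vonmises(MU[state], KAPPA[state])
        state = rng.choice(2, p=TRANS[state])
    return angles


def log_likelihood(angles):
    """Log-density of an angle sequence, computed with the scaled forward algorithm."""
    alpha = START * von_mises_pdf(angles[0], MU, KAPPA)
    total = 0.0
    for theta in angles[1:]:
        scale = alpha.sum()
        total += np.log(scale)
        # Propagate through the transition matrix, then weight by the emission density.
        alpha = (alpha / scale) @ TRANS * von_mises_pdf(theta, MU, KAPPA)
    return total + np.log(alpha.sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conformation = sample_angles(10, rng)
    print(conformation)
    print(log_likelihood(conformation))
```

Unlike a fragment library, every sampled angle here is a continuous value, and any angle sequence can be scored with a proper log-density. That pairing of continuous sampling with a score is the property the abstract contrasts with fragment assembly.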