Qin Jing, Fricke Markus, Marz Manja, Stadler Peter F, Backofen Rolf
Department of Mathematics and Computer Science, Campusvej 55, DK-5230, Odense M, Denmark.
Bioinformatics/High Throughput Analysis Faculty of Mathematics und Computer Science Friedrich-Schiller-University, Leutragraben 1, D-07743 Jena, Germany.
Algorithms Mol Biol. 2014 Sep 11;9:19. doi: 10.1186/1748-7188-9-19. eCollection 2014.
Large RNA molecules are often composed of multiple functional domains whose spatial arrangement strongly influences their function. Pre-mRNA splicing, for instance, relies on the spatial proximity of the splice junctions that can be separated by very long introns. Similar effects appear in the processing of RNA virus genomes. Albeit a crude measure, the distribution of spatial distances in thermodynamic equilibrium harbors useful information on the shape of the molecule that in turn can give insights into the interplay of its functional domains.
Spatial distance can be approximated by the graph-distance in RNA secondary structure. We show here that the equilibrium distribution of graph-distances between a fixed pair of nucleotides can be computed in polynomial time by means of dynamic programming. While a naïve implementation would yield recursions with a very high time complexity of O(n (6) D (5)) for sequence length n and D distinct distance values, it is possible to reduce this to O(n (4)) for practical applications in which predominantly small distances are of of interest. Further reductions, however, seem to be difficult. Therefore, we introduced sampling approaches that are much easier to implement. They are also theoretically favorable for several real-life applications, in particular since these primarily concern long-range interactions in very large RNA molecules.
The graph-distance distribution can be computed using a dynamic programming approach. Although a crude approximation of reality, our initial results indicate that the graph-distance can be related to the smFRET data. The additional file and the software of our paper are available from http://www.rna.uni-jena.de/RNAgraphdist.html.
大型RNA分子通常由多个功能域组成,其空间排列对其功能有强烈影响。例如,前体mRNA剪接依赖于剪接位点的空间接近性,而这些位点可能被非常长的内含子隔开。类似的效应也出现在RNA病毒基因组的加工过程中。尽管是一种粗略的测量方法,但热力学平衡中空间距离的分布包含了有关分子形状的有用信息,进而可以深入了解其功能域之间的相互作用。
空间距离可以通过RNA二级结构中的图距离来近似。我们在此表明,通过动态规划可以在多项式时间内计算固定核苷酸对之间图距离的平衡分布。虽然简单的实现会产生时间复杂度非常高的递归,对于序列长度为n和D个不同距离值,时间复杂度为O(n(6)D(5)),但对于主要关注小距离的实际应用,可以将其降低到O(n(4))。然而,进一步的降低似乎很困难。因此,我们引入了更易于实现的采样方法。它们在理论上也有利于几个实际应用,特别是因为这些应用主要涉及非常大的RNA分子中的长程相互作用。
图距离分布可以使用动态规划方法计算。尽管是对现实的粗略近似,但我们的初步结果表明,图距离可以与单分子荧光共振能量转移(smFRET)数据相关。我们论文的附加文件和软件可从http://www.rna.uni-jena.de/RNAgraphdist.html获得。