Practical Computer Science, Faculty of Technology, Bielefeld University, D-33615 Bielefeld, Germany.
Bioinformatics. 2010 Mar 1;26(5):632-9. doi: 10.1093/bioinformatics/btq014. Epub 2010 Jan 14.
Abstract shape analysis allows efficient computation of a representative sample of low-energy foldings of an RNA molecule. More comprehensive information is obtained by computing shape probabilities, which accumulate the Boltzmann probabilities of all structures within each abstract shape. Such information is superior to free energies because it is independent of sequence length and base composition. However, up to this point, computation of shape probabilities has evaluated all shapes simultaneously, at a computational cost that is exponential in the length of the sequence.
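The accumulation step described above can be illustrated with a minimal Python sketch. It is not the published implementation: the helper names `shape5` and `shape_probabilities`, the example structures, and their energies are all hypothetical, and the shape mapping is a simplified stand-in for the most abstract shape level (unpaired bases are dropped and uninterrupted nestings of helices are merged). Probabilities are normalized over the supplied structures only, whereas the real method sums over the complete folding space.

```python
import math
from collections import defaultdict

# Gas constant times temperature in kcal/mol at 37 degrees C (310.15 K).
RT = 0.0019872 * 310.15


def shape5(dot_bracket):
    """Map a dot-bracket string to a simplified, most-abstract shape string.

    Assumption for illustration: unpaired bases are ignored, and a helix that
    encloses exactly one other helix is merged with it.
    """
    root = []
    stack = [root]
    for ch in dot_bracket:
        if ch == "(":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced dot-bracket string")
            stack.pop()
    if len(stack) != 1:
        raise ValueError("unbalanced dot-bracket string")

    def render(node):
        # Collapse chains of singly nested helices into one bracket pair.
        while len(node) == 1:
            node = node[0]
        return "[" + "".join(render(child) for child in node) + "]"

    return "".join(render(child) for child in root) or "_"


def shape_probabilities(structures):
    """Sum Boltzmann probabilities per shape.

    `structures` is an iterable of (dot_bracket, free_energy_kcal_per_mol).
    """
    structures = list(structures)
    weights = [math.exp(-energy / RT) for _, energy in structures]
    z = sum(weights)  # partition function of the supplied ensemble
    probs = defaultdict(float)
    for (db, _), w in zip(structures, weights):
        probs[shape5(db)] += w / z
    return dict(probs)


if __name__ == "__main__":
    # Hypothetical toy ensemble; the energies are made up for illustration.
    ensemble = [
        ("((...))", -3.0),
        ("(.....)", -1.0),
        ("((..))..((..))", -2.0),
    ]
    for shape, p in sorted(shape_probabilities(ensemble).items()):
        print(shape, round(p, 4))
```

In this toy ensemble, the first two structures fall into the same shape, so their Boltzmann probabilities are added together, which is the sense in which a shape probability summarizes a whole class of structures.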
We devise an approach called RapidShapes that computes the shapes above a specified probability threshold T by generating a list of promising shapes and constructing specialized folding programs for each shape to compute its share of Boltzmann probability. This aims at a heuristic improvement of runtime, while still computing exact probability values.
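The control flow of such a threshold-driven search might look like the following Python sketch. This is an illustration under stated assumptions, not the RapidShapes code: `shapes_above_threshold` and the `shape_probability` callback are hypothetical names, the callback stands in for a specialized folding program that returns the exact probability of one shape, and the stopping rule shown (stop once the unexplored probability mass drops below T) is simply one criterion that follows from probabilities summing to 1.

```python
from typing import Callable, Dict, Iterable


def shapes_above_threshold(
    candidates: Iterable[str],
    shape_probability: Callable[[str], float],
    threshold: float,
) -> Dict[str, float]:
    """Return every evaluated candidate shape with probability >= threshold.

    `candidates` is assumed to list promising shapes first.
    `shape_probability` is a hypothetical stand-in for a per-shape
    folding program that yields an exact probability.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must lie in (0, 1]")
    found: Dict[str, float] = {}
    explored = 0.0  # probability mass accounted for so far
    for shape in candidates:
        # If less than `threshold` mass remains, no unevaluated shape
        # can reach the threshold, so the search can stop early.
        if 1.0 - explored < threshold:
            break
        p = shape_probability(shape)
        explored += p
        if p >= threshold:
            found[shape] = p
    return found


if __name__ == "__main__":
    # Made-up probabilities for illustration only.
    table = {"[]": 0.6, "[][]": 0.25, "[[][]]": 0.1, "[][][]": 0.05}
    print(shapes_above_threshold(table, table.__getitem__, threshold=0.2))
```

With the made-up table above, only the first two shapes are evaluated before the remaining mass falls below the threshold, which mirrors the idea that only a small proportion of shapes needs to be computed.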
Evaluating this approach and several substrategies, we find that only a small proportion of shapes actually have to be computed. For an RNA sequence of length 400, this leads, depending on the threshold, to a 10- to 138-fold speed-up compared with the previous complete method. Thus, probabilistic shape analysis has become feasible in medium-scale applications, such as the screening of RNA transcripts in a bacterial genome.
RapidShapes is available via http://bibiserv.cebitec.uni-bielefeld.de/rnashapes