Ganjtabesh M, Zare-Mirakabad F, Nowzari-Dalini A
School of Mathematics, Statistics, and Computer Science, College of Science, University of Tehran, Tehran, Iran; School of Computer Science, Institute for Studies in Theoretical Physics and Mathematics (IPM), Tehran, Iran; Laboratoire d'Informatique (LIX), Ecole Polytechnique, Palaiseau CEDEX, 91128, France.
Department of Applied Mathematics, Faculty of Mathematics and Computer Science, Amirkabir University of Technology, Tehran, Iran.
EXCLI J. 2013 Jun 17;12:546-55. eCollection 2013.
In living systems, RNAs play important biological functions. The functional form of an RNA frequently requires a specific tertiary structure. The scaffold for this structure is provided by secondary structural elements that are hydrogen bonds within the molecule. Here, we concentrate on the inverse RNA folding problem. In this problem, an RNA secondary structure is given as a target structure and the goal is to design an RNA sequence that its structure is the same (or very similar) to the given target structure. Different heuristic search methods have been proposed for this problem. One common feature among these methods is to use a folding algorithm to evaluate the accuracy of the designed RNA sequence during the generation process. The well known folding algorithms take O(n(3)) times where n is the length of the RNA sequence. In this paper, we introduce a new algorithm called GGI-Fold based on multi-objective genetic algorithm and Gibbs sampling method for the inverse RNA folding problem. Our algorithm generates a sequence where its structure is the same or very similar to the given target structure. The key feature of our method is that it never uses any folding algorithm to improve the quality of the generated sequences. We compare our algorithm with RNA-SSD for some biological test samples. In all test samples, our algorithm outperforms the RNA-SSD method for generating a sequence where its structure is more stable.
在生命系统中,RNA发挥着重要的生物学功能。RNA的功能形式通常需要特定的三级结构。这种结构的支架由分子内的氢键形成的二级结构元件提供。在此,我们专注于RNA反向折叠问题。在这个问题中,给定一个RNA二级结构作为目标结构,目标是设计一个其结构与给定目标结构相同(或非常相似)的RNA序列。针对这个问题已经提出了不同的启发式搜索方法。这些方法的一个共同特点是在生成过程中使用折叠算法来评估设计的RNA序列的准确性。著名的折叠算法的时间复杂度为O(n(3)),其中n是RNA序列的长度。在本文中,我们针对RNA反向折叠问题引入了一种基于多目标遗传算法和吉布斯采样方法的新算法GGI-Fold。我们的算法生成一个其结构与给定目标结构相同或非常相似的序列。我们方法的关键特征是它从不使用任何折叠算法来提高生成序列的质量。我们将我们的算法与RNA-SSD针对一些生物学测试样本进行了比较。在所有测试样本中,我们的算法在生成结构更稳定的序列方面优于RNA-SSD方法。