Alkan Can, Karakoç Emre, Nadeau Joseph H, Sahinalp S Cenk, Zhang Kaizhong
Department of Genome Sciences, University of Washington, Seattle, 98195, USA.
J Comput Biol. 2006 Mar;13(2):267-82. doi: 10.1089/cmb.2006.13.267.
Recent studies demonstrating the existence of special noncoding "antisense" RNAs used in post-transcriptional gene regulation have received considerable attention. These RNAs are synthesized naturally to control gene expression in C. elegans, Drosophila, and other organisms; they are also known to regulate plasmid copy numbers in E. coli. Small RNAs have also been artificially constructed to knock out genes of interest in humans and other organisms in order to learn more about their functions. Although there are a number of algorithms for predicting the secondary structure of a single RNA molecule, no such algorithm exists for reliably predicting the joint secondary structure of two interacting RNA molecules or for measuring the stability of such a joint structure. In this paper, we describe the RNA-RNA interaction prediction (RIP) problem between an antisense RNA and its target mRNA and develop efficient algorithms to solve it. Our algorithms minimize the joint free energy between the two RNA molecules under a number of energy models of increasing complexity. Because the computational resources needed by our most accurate approach are prohibitive for long RNA molecules, we also describe how to speed up our techniques through a number of heuristic approaches while experimentally maintaining the original accuracy. Equipped with this fast approach, we apply our method to discover targets for any given antisense RNA in the associated genome sequence.
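To make the idea of minimizing a joint energy concrete, the following is a minimal sketch of a deliberately simplified toy model. It is not the algorithm of the paper. It assumes that only intermolecular base pairs count, that each pair contributes a fixed energy of -1, that pairs must not cross, and that Watson-Crick and G-U wobble pairs are allowed. The function names `max_intermolecular_pairs` and `toy_joint_energy` and the `pair_energy` parameter are hypothetical, introduced only for illustration. Under these assumptions the problem reduces to an alignment-style dynamic program between the antisense RNA and the reversed target.

```python
# Toy base-pair-counting model for two interacting RNAs.
# Assumptions (not from the paper): intramolecular pairs are ignored,
# every intermolecular pair has the same energy, and pairs may not cross.

# Allowed pairs: Watson-Crick plus G-U wobble.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def max_intermolecular_pairs(antisense: str, target: str) -> int:
    """Return the maximum number of non-crossing antiparallel pairs.

    Both sequences are given 5'->3'. Reversing the target turns the
    antiparallel pairing constraint into an ordinary order-preserving
    alignment, so the recurrence looks like longest common subsequence
    with "match" replaced by "can form a base pair".
    """
    s = antisense.upper()
    t = target.upper()[::-1]  # read the target 3'->5'
    n, m = len(s), len(t)

    # best[i][j]: most pairs between s[:i] and t[:j]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            can_pair = (s[i - 1], t[j - 1]) in PAIRS
            paired = best[i - 1][j - 1] + (1 if can_pair else 0)
            best[i][j] = max(best[i - 1][j], best[i][j - 1], paired)
    return best[n][m]


def toy_joint_energy(antisense: str, target: str,
                     pair_energy: float = -1.0) -> float:
    """Joint energy in the toy model: pair_energy times the pair count."""
    return pair_energy * max_intermolecular_pairs(antisense, target)


if __name__ == "__main__":
    print(toy_joint_energy("GGGG", "CCCC"))
    print(toy_joint_energy("AAAA", "AAAA"))
```

This sketch runs in time proportional to the product of the two sequence lengths. A model that also scores intramolecular pairs or uses stacking and loop energies needs a considerably more involved recurrence, which is where the cost noted in the abstract for the most accurate approach arises.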