Bergig Oriel, Barash Danny, Nudler Evgeny, Kedem Klara
Department of Computer Science, Ben-Gurion University, Beer Sheva 84105, Israel.
In Silico Biol. 2004;4(4):593-604.
Traditional sequence-based search methods such as BLAST and FASTA can be used to identify sequence similarities. Recently, there has been growing interest in performing RNA shape similarity searches inside selected genes to locate RNA structure motifs that are known to play functionally important roles. For example, in the newly discovered RNA genetic control elements called "riboswitches", the box domain is known to be highly conserved among various bacterial species in both its nucleotide composition and its shape. In non-bacterial species, however, shape conservation is likely to become more important than sequence conservation when searching for riboswitch patterns. For this purpose, we present an approach tailored to detecting RNA shape similarities. We extend the Structure to String (STR2) method, initially proposed for locating shape similarities in proteins, to the predicted secondary structures of RNAs. STR2 for RNAs translates a secondary structure into a string of characters, after which known sequence-based search algorithms with efficient implementations are applied. We validate that STR2 succeeds in locating G-box riboswitches in prokaryotes, as expected. Subsequently, we show running examples of attempts to detect G-box riboswitch candidates in eukaryotes.
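The idea of translating a secondary structure into a string and then searching it with string methods can be illustrated with a minimal sketch. The abstract does not specify the alphabet or the search algorithm used by STR2, so everything below is an assumption made for illustration: the input is taken to be a predicted structure in dot-bracket notation, the three-letter alphabet (`O`, `C`, `U`) and the function names `shape_string` and `find_shape` are hypothetical, and exact substring matching stands in for the sequence-based search tools the method relies on.

```python
from itertools import groupby

# Hypothetical alphabet, chosen for illustration only; it is not the STR2 encoding.
# O = run of opening base pairs, C = run of closing base pairs, U = unpaired run.
SYMBOLS = {"(": "O", ")": "C", ".": "U"}


def shape_string(dot_bracket: str) -> str:
    """Collapse each run of identical dot-bracket symbols into a single letter."""
    return "".join(SYMBOLS[ch] for ch, _ in groupby(dot_bracket))


def find_shape(query: str, target: str) -> list[int]:
    """Return offsets in the target's shape string where the query's shape occurs.

    Exact matching is a stand-in for a real sequence-based search.
    """
    q, t = shape_string(query), shape_string(target)
    hits = []
    start = t.find(q)
    while start != -1:
        hits.append(start)
        start = t.find(q, start + 1)
    return hits


if __name__ == "__main__":
    # A single hairpin used as the query motif, searched in a two-hairpin structure.
    print(shape_string("((..((...))..))"))
    print(find_shape("((...))", "..((....))..(((..)))."))
```

Because runs are collapsed, stems and loops of different lengths map to the same letter, so the comparison depends on shape rather than on nucleotide composition. A tolerant comparison would replace the exact match with an alignment-based search over the encoded strings.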