Klein Robert J, Eddy Sean R
Howard Hughes Medical Institute & Department of Genetics, Washington University School of Medicine, Saint Louis, Missouri 63110, USA.
BMC Bioinformatics. 2003 Sep 22;4:44. doi: 10.1186/1471-2105-4-44.
For many RNA molecules, secondary structure rather than primary sequence is the evolutionarily conserved feature. No programs have yet been published that allow searching a sequence database for homologs of a single RNA molecule on the basis of secondary structure.
We have developed a program, RSEARCH, that takes a single RNA sequence with its secondary structure and uses a local alignment algorithm to search a database for homologous RNAs. For this purpose, we have developed a series of base pair and single nucleotide substitution matrices for RNA sequences, which we call RIBOSUM matrices. RSEARCH reports the statistical confidence of each hit along with its structural alignment. We show several examples in which RSEARCH outperforms the primary sequence search programs BLAST and SSEARCH. The primary drawback of the program is that it is slow. The C code for RSEARCH is freely available from our lab's website.
RSEARCH outperforms primary sequence programs in finding homologs of structured RNA sequences.
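As a rough illustration of what a substitution matrix of the kind mentioned above might look like, the sketch below computes log-odds scores for single nucleotide substitutions from aligned-pair counts, in the general style of BLOSUM-type matrices. It is not the RSEARCH code and does not reproduce the RIBOSUM construction: the function name `log_odds_matrix`, the pseudocount, the symmetrization step, and the toy counts are all assumptions made for this example, and the base pair matrices are not covered.

```python
import math
from itertools import product

NUCLEOTIDES = "ACGU"


def log_odds_matrix(pair_counts, pseudocount=1.0):
    """Return log2-odds scores for aligned nucleotide pairs.

    pair_counts maps (a, b) tuples to observed counts of nucleotide a
    aligned to nucleotide b. Missing pairs are treated as zero.
    Hypothetical helper for illustration only, not part of RSEARCH.
    """
    # Add a pseudocount so that unobserved pairs do not give log(0).
    counts = {
        (a, b): pair_counts.get((a, b), 0.0) + pseudocount
        for a, b in product(NUCLEOTIDES, repeat=2)
    }
    total = sum(counts.values())

    # Symmetrize: an alignment column (a, b) is equivalent to (b, a).
    joint = {
        (a, b): (counts[(a, b)] + counts[(b, a)]) / (2.0 * total)
        for a, b in product(NUCLEOTIDES, repeat=2)
    }

    # Background frequency of each nucleotide, taken as the marginal
    # of the joint distribution (an assumption of this sketch).
    background = {
        a: sum(joint[(a, b)] for b in NUCLEOTIDES) for a in NUCLEOTIDES
    }

    # Score = log2(observed pair frequency / frequency expected by chance).
    return {
        (a, b): math.log2(joint[(a, b)] / (background[a] * background[b]))
        for a, b in product(NUCLEOTIDES, repeat=2)
    }


if __name__ == "__main__":
    # Toy counts in which identical nucleotides are aligned most often.
    toy_counts = {(n, n): 10 for n in NUCLEOTIDES}
    scores = log_odds_matrix(toy_counts)
    for a in NUCLEOTIDES:
        print(a, " ".join(f"{scores[(a, b)]:6.2f}" for b in NUCLEOTIDES))
```

With counts like these, conserved pairs receive positive scores and substitutions receive negative ones, which is the property a local alignment algorithm relies on when scoring candidate hits.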