Washietl Stefan, Hofacker Ivo L, Stadler Peter F
Department of Theoretical Chemistry and Structural Biology, University of Vienna, Währingerstrasse 17, A-1090 Wien, Austria.
Proc Natl Acad Sci U S A. 2005 Feb 15;102(7):2454-9. doi: 10.1073/pnas.0409169102. Epub 2005 Jan 21.
We report an efficient method for detecting functional RNAs. The approach, which combines comparative sequence analysis and structure prediction, has already yielded excellent results for a small number of aligned sequences and is suitable for large-scale genomic screens. It consists of two basic components: (i) a measure for RNA secondary structure conservation based on computing a consensus secondary structure, and (ii) a measure for thermodynamic stability, which, in the spirit of a z score, is normalized with respect to both sequence length and base composition but can be calculated without sampling from shuffled sequences. Functional RNA secondary structures can be identified in multiple sequence alignments with high sensitivity and high specificity. We demonstrate that this approach is not only much more accurate than previous methods but also significantly faster. The method is implemented in the program RNAz, which can be downloaded from www.tbi.univie.ac.at/~wash/RNAz. We screened all alignments of length n ≥ 50 in the Comparative Regulatory Genomics database, which compiles conserved noncoding elements in upstream regions of orthologous genes from human, mouse, rat, Fugu, and zebrafish. We recovered all of the known noncoding RNAs and cis-acting elements with high significance and found compelling evidence for many other conserved RNA secondary structures that, to our knowledge, have not been described so far.
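The abstract names the two measures without giving formulas, so the following Python sketch is only an illustration of what such quantities can look like, not the RNAz implementation. It assumes that folding energies (minimum free energies in kcal/mol) have already been computed by some external tool. The function names, the ratio used for structure conservation, and all numeric values are assumptions made for this example. The z score here is the conventional sampling-based form, computed from energies of shuffled sequences; the method described above is stated to obtain a comparable normalization without such sampling.

```python
from statistics import mean, stdev


def z_score(native_mfe, background_mfes):
    """Sampling-based z score: how many standard deviations the native
    folding energy lies from the mean energy of shuffled sequences.
    Negative values indicate a structure more stable than background."""
    mu = mean(background_mfes)
    sigma = stdev(background_mfes)  # sample standard deviation
    if sigma == 0:
        raise ValueError("background energies have zero variance")
    return (native_mfe - mu) / sigma


def structure_conservation_index(consensus_mfe, single_mfes):
    """Assumed conservation measure: energy of the consensus structure
    divided by the mean energy of the sequences folded individually.
    Values near 1 suggest a shared fold, values near 0 suggest none."""
    avg = mean(single_mfes)
    if avg == 0:
        raise ValueError("mean single-sequence energy is zero")
    return consensus_mfe / avg


# Hypothetical energies in kcal/mol, invented for illustration only.
z = z_score(-30.0, [-20.0, -22.0, -18.0, -21.0, -19.0])
sci = structure_conservation_index(-24.0, [-30.0, -28.0, -32.0])
print(f"z = {z:.2f}, conservation index = {sci:.2f}")
```

With these invented numbers the native sequence is far more stable than its shuffled background (a strongly negative z) and the consensus fold retains most of the single-sequence energy, which is the kind of joint signal the two components are meant to capture.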