Le Shu-Yun, Chen Jih-H, Maizel Jacob V
NCI Center for Cancer Research, National Cancer Institute, NIH, Frederick, MD 21702, USA.
Proc IEEE Comput Soc Bioinform Conf. 2003;2:190-6.
Distinct local structures are frequently correlated with functional RNA elements involved in post-transcriptional regulation of gene expression. The discovery of microRNAs (miRNAs) suggests that there is a large class of small non-coding RNAs in eukaryotic genomes. These miRNAs have the potential to form distinct fold-back stem-loop structures. Predicting such well-ordered folding sequences (WFS) in genomic sequences aids our understanding of RNA-based gene regulation and the determination of local RNA elements with structure-dependent functions. In this study, we describe a novel method for discovering local WFS in a nucleotide sequence by Monte Carlo simulation and RNA folding. In this approach, the quality of a local WFS is assessed by the energy difference, E(diff), between the optimal structure folded in the local segment and its corresponding optimal restrained structure, in which all base pairings formed in the optimal structure are prohibited. Distinct WFS can be discovered by scanning successive segments along a sequence and evaluating the difference between the E(diff) of the natural sequence and the E(diff) values computed from randomly shuffled sequences. Our results indicate that the statistically significant WFS detected in the Caenorhabditis elegans (C. elegans) genomic sequences F49E12, T07C5, T07D1, T10H9, Y56A3A and Y71G12B coincide with known fold-back stem-loops found in miRNA precursors. The potential and implications of our method for searching for miRNAs in genomes are discussed.
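The procedure outlined in the abstract (fold a segment, refold it with the optimal base pairs prohibited, take E(diff), and compare against shuffled sequences) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not from the paper: a toy Nussinov-style pairing energy stands in for a real thermodynamic folding model, the comparison statistic is a plain z-score, and all names (`fold`, `e_diff`, `significance`, `scan`) and parameter values (window size, step, number of shuffles) are hypothetical.

```python
import random
import statistics

# ASSUMPTION: toy pairing energies in arbitrary units, a stand-in for a
# real nearest-neighbour thermodynamic model.
PAIR_ENERGY = {("G", "C"): -3.0, ("C", "G"): -3.0,
               ("A", "U"): -2.0, ("U", "A"): -2.0,
               ("G", "U"): -1.0, ("U", "G"): -1.0}
MIN_LOOP = 3  # minimum number of unpaired bases enclosed by a hairpin


def fold(seq, forbidden=frozenset()):
    """Return (energy, pairs) of a minimum-energy nested structure.

    Pairs listed in `forbidden` (as (i, k) index tuples, i < k) may not form.
    """
    n = len(seq)
    energy = [[0.0] * n for _ in range(n)]
    choice = [[None] * n for _ in range(n)]
    for span in range(MIN_LOOP + 1, n):
        for i in range(n - span):
            j = i + span
            best, partner = energy[i + 1][j], None  # case: base i unpaired
            for k in range(i + MIN_LOOP + 1, j + 1):  # case: i pairs with k
                e = PAIR_ENERGY.get((seq[i], seq[k]))
                if e is None or (i, k) in forbidden:
                    continue
                right = energy[k + 1][j] if k < j else 0.0
                total = e + energy[i + 1][k - 1] + right
                if total < best:
                    best, partner = total, k
            energy[i][j], choice[i][j] = best, partner
    # Trace back the chosen pairs.
    pairs, stack = set(), [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= MIN_LOOP:
            continue
        k = choice[i][j]
        if k is None:
            stack.append((i + 1, j))
        else:
            pairs.add((i, k))
            stack.append((i + 1, k - 1))
            stack.append((k + 1, j))
    return (energy[0][n - 1] if n else 0.0), frozenset(pairs)


def e_diff(seq):
    """Energy of the restrained optimum minus energy of the optimum (>= 0)."""
    e_opt, pairs = fold(seq)
    e_restrained, _ = fold(seq, forbidden=pairs)
    return e_restrained - e_opt


def significance(seq, n_shuffles, rng):
    """Return (E(diff), z-score) of seq against shuffled copies of itself."""
    natural = e_diff(seq)
    shuffled_scores = []
    for _ in range(n_shuffles):
        bases = list(seq)
        rng.shuffle(bases)  # preserves base composition only
        shuffled_scores.append(e_diff("".join(bases)))
    sd = statistics.pstdev(shuffled_scores)
    z = (natural - statistics.mean(shuffled_scores)) / sd if sd > 0 else 0.0
    return natural, z


def scan(seq, window=40, step=10, n_shuffles=50, seed=0):
    """Slide a window along seq; return a list of (start, E(diff), z-score)."""
    rng = random.Random(seed)
    results = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        segment = seq[start:start + window]
        results.append((start, *significance(segment, n_shuffles, rng)))
    return results


if __name__ == "__main__":
    demo = "ACGUACGGAUCC" + "GGGGCUAGAAAACUAGCCCC" + "AUGCAUUGCA"
    for start, ediff, z in scan(demo, window=20, step=4, n_shuffles=30):
        print(f"start={start:3d}  E(diff)={ediff:5.1f}  z={z:6.2f}")
```

In this sketch, a large E(diff) means that the segment loses much of its stability once its optimal pairings are disallowed, which is the sense in which a segment counts as well ordered; windows whose E(diff) lies far above the shuffled distribution would be the candidates to inspect. A real analysis would replace `fold` with a thermodynamic folding program that supports pairing constraints.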