Department of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA.
Nucleic Acids Res. 2012 Feb;40(3):1307-17. doi: 10.1093/nar/gkr804. Epub 2011 Oct 5.
RNA structural motifs are the building blocks of complex RNA architecture. Identifying non-coding RNA structural motifs is a critical step toward understanding their structures and functions. In this article, we present a clustering approach for de novo RNA structural motif identification. We applied our approach to a data set containing 5S, 16S and 23S rRNAs and rediscovered many known motifs, including the GNRA tetraloop, kink-turn, C-loop, sarcin-ricin, reverse kink-turn, hook-turn, E-loop and tandem-sheared motifs, with higher accuracy than the state-of-the-art clustering method. We also identified a number of potential novel instances of the GNRA tetraloop, kink-turn, sarcin-ricin and tandem-sheared motifs. More importantly, our clustering analysis revealed several novel structural motif families. We identified a highly asymmetric bulge loop motif that resembles a rope sling. We also found an internal loop motif that can significantly increase the twist of the helix. Finally, we discovered a subfamily of the hexaloop motif whose geometry differs significantly from that of the currently known hexaloop motif. The discoveries presented in this article substantially expand current knowledge of RNA structural motifs.