Center for Computational Biology and Bioinformatics, Section of Integrative Biology in the School of Biological Sciences, University of Texas at Austin, Austin, TX 78712, USA.
J Mol Biol. 2011 Oct 21;413(2):473-83. doi: 10.1016/j.jmb.2011.08.033. Epub 2011 Aug 23.
RNA is directly associated with a growing number of functions within the cell. The accurate prediction of different RNA higher-order structures from their nucleic acid sequences will provide insight into their functions and molecular mechanics. We have been determining statistical potentials for a collection of structural elements that is larger than the set of structural elements with experimentally determined energy values. The experimentally derived free energies and the statistical potentials for canonical base-pair stacks are analogous, demonstrating that statistical potentials derived from comparative data can be used as an alternative energetic parameter. A new computational infrastructure, the RNA Comparative Analysis Database (rCAD), which utilizes a relational database, was developed to manipulate and analyze very large sequence alignments and secondary-structure data sets. Using rCAD, we determined a richer set of energetic parameters for RNA fundamental structural elements, including hairpin and internal loops. A new version of RNAfold was developed to utilize these statistical potentials. Overall, these new statistical potentials for hairpin and internal loops, integrated into the new version of RNAfold, demonstrated significant improvements in the prediction accuracy of RNA secondary structure.
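The abstract does not state how the statistical potentials are computed. As a minimal sketch of the general idea only, the example below assumes a common inverse-Boltzmann form, in which the pseudo-energy of a structural element is -RT ln(f_obs / f_ref), where f_obs is its observed frequency in comparative structure data and f_ref is an expected (reference) frequency. The function name, the pseudocount, the temperature, and the counts are hypothetical illustrations and are not taken from rCAD or RNAfold.

```python
import math

R_KCAL = 0.0019872   # gas constant in kcal/(mol*K)
T_KELVIN = 310.15    # 37 degrees C, assumed here for illustration


def statistical_potentials(observed_counts, reference_freqs,
                           temperature=T_KELVIN, pseudocount=1.0):
    """Return a pseudo-energy (kcal/mol) for each structural element.

    Assumed form (inverse Boltzmann): E = -R * T * ln(f_obs / f_ref).
    observed_counts: element -> count in the comparative data set
    reference_freqs: element -> expected frequency (must be > 0)
    pseudocount:     added to every count so unseen elements do not give log(0)
    """
    total = sum(observed_counts.get(e, 0) for e in reference_freqs)
    total += pseudocount * len(reference_freqs)
    potentials = {}
    for element, f_ref in reference_freqs.items():
        f_obs = (observed_counts.get(element, 0) + pseudocount) / total
        potentials[element] = -R_KCAL * temperature * math.log(f_obs / f_ref)
    return potentials


# Hypothetical counts for two base-pair stacks (not real data).
counts = {"GC/CG": 300, "AU/UA": 100}
reference = {"GC/CG": 0.5, "AU/UA": 0.5}
for stack, energy in statistical_potentials(counts, reference).items():
    print(f"{stack}: {energy:+.2f} kcal/mol")
```

Under this assumed form, elements seen more often than the reference expects receive negative (favorable) values and rarer elements receive positive values, which is the sense in which such potentials can stand in for measured free energies.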