Department of Computational Biology, Faculty of Frontier Science, The University of Tokyo, Kashiwa, Chiba, Japan.
Bioinformatics. 2012 Apr 15;28(8):1093-101. doi: 10.1093/bioinformatics/bts097. Epub 2012 Feb 28.
Measuring the effects of base mutations is a powerful tool for functional and evolutionary analyses of RNA structures. To date, only a few methods have been developed for systematically computing the thermodynamic changes of RNA secondary structures in response to base mutations.
We have developed algorithms for computing the changes in the ensemble free energy, mean energy and thermodynamic entropy of RNA secondary structures for exhaustive patterns of single and double mutations. The computational complexities are O(NW^2) for single mutations and O(N^2W^2) for double mutations, where N is the sequence length and W is the maximal base pair span, both with large constant factors. We show that the changes are relatively insensitive to GC composition and to the maximal span constraint. The mean free energy changes are bounded at approximately 7-9 kcal/mol and depend only weakly on position if sequence lengths are sufficiently large. For tRNA sequences, the most stabilizing mutations come from changes to the 5'-most base of the anticodon loop. We also show that most base changes in the acceptor stem destabilize the structure, indicating that the nucleotide sequence of the acceptor stem is highly optimized for secondary structure stability. We investigate the 22 tRNA genes in the human mitochondrial genome and show that non-pathogenic polymorphisms tend to cause smaller changes in thermodynamic variables than generic mutations, suggesting that a mutation that substantially increases these thermodynamic variables is more likely to be pathogenic or lethal.
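The three quantities named above are linked by standard statistical mechanics: G = -RT ln Z, <E> = sum_s p(s) E(s), and S = (<E> - G) / T. The Python sketch below illustrates these relations on an explicitly enumerated list of structure energies and lists all 3N single-base substitutions of a sequence. It is a brute-force illustration only, not the dynamic-programming algorithm implemented in Rchange; the function names, the default temperature of 310.15 K and the example inputs are assumptions introduced here.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def ensemble_thermodynamics(energies, temperature=310.15):
    """Return (free energy, mean energy, entropy) for a list of structure
    energies in kcal/mol. The list of energies is a hypothetical input:
    real ensembles are far too large to enumerate explicitly."""
    rt = R_KCAL * temperature
    e_min = min(energies)
    # Shift by the minimum energy so the exponentials cannot overflow.
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z_shifted = sum(weights)
    free_energy = e_min - rt * math.log(z_shifted)   # G = -RT ln Z
    probs = [w / z_shifted for w in weights]         # Boltzmann probabilities
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    entropy = (mean_energy - free_energy) / temperature  # S = (<E> - G) / T
    return free_energy, mean_energy, entropy


def single_mutants(seq, alphabet="ACGU"):
    """Yield (position, new_base, mutated_sequence) for every single-base
    substitution, i.e. 3N variants for a sequence of length N."""
    for i, base in enumerate(seq):
        for new in alphabet:
            if new != base:
                yield i, new, seq[:i] + new + seq[i + 1:]
```

For a mutant, the change in each quantity is the value computed for the mutated sequence minus the value for the wild type. Double mutations can be enumerated the same way over pairs of positions, giving 9N(N-1)/2 variants.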
The C++ source code of the Rchange software is available at http://www.ncrna.org/software/rchange/.