Hamada Michiaki, Sato Kengo, Kiryu Hisanori, Mituyama Toutai, Asai Kiyoshi
Mizuho Information & Research Institute, Inc, Tokyo, Japan.
Bioinformatics. 2009 Jun 15;25(12):i330-8. doi: 10.1093/bioinformatics/btp228.
Secondary structure prediction of RNA sequences is an important problem. Progress has been made in this area, but the accuracy of prediction from a single RNA sequence is still limited. In many cases, however, homologous RNA sequences are available alongside the target RNA sequence whose secondary structure is to be predicted.
In this article, we propose a new method for predicting the secondary structure of an individual RNA sequence that takes the information in its homologous sequences into account without assuming that all of the sequences share a common secondary structure. The proposed method is based on posterior decoding techniques, which consider all the suboptimal secondary structures of the target and homologous sequences and all the suboptimal alignments between the target sequence and each of the homologous sequences. In our computational experiments, the proposed method provides better predictions than those based only on the information in the individual RNA sequence and those obtained with methods for predicting the common secondary structure of the homologous sequences. Remarkably, we found that common secondary structure predictions sometimes give worse predictions for the secondary structure of a target sequence than predictions from the target sequence alone, while the proposed method always gives good predictions for the secondary structure of target sequences in all tested cases.
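The abstract gives no formulas, so the following is only a minimal sketch of one way posterior decoding with homologous sequences could be organized, not the authors' implementation. It assumes that base-pairing probability matrices (for the target and each homolog) and alignment match probabilities (target position against homolog position) have already been computed by external tools. The function names, the equal-weight averaging, the `gamma` threshold score and the Nussinov-style recursion are all illustrative assumptions.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def transfer_pairs(bpp_hom: Matrix, match: Matrix) -> Matrix:
    """Project homolog base-pair probabilities onto target coordinates.

    match[i][k] is the (assumed, precomputed) posterior probability that
    target position i aligns to homolog position k. Naive O(n^2 m^2) loop.
    """
    n, m = len(match), len(bpp_hom)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            out[i][j] = sum(
                match[i][k] * match[j][l] * bpp_hom[k][l]
                for k in range(m)
                for l in range(k + 1, m)
            )
    return out


def combine(bpp_target: Matrix, transferred: List[Matrix],
            weight: float = 0.5) -> Matrix:
    """Mix the target's own probabilities with the homolog average.

    The mixing weight is a hypothetical parameter chosen for illustration.
    """
    n = len(bpp_target)
    if not transferred:
        return [row[:] for row in bpp_target]
    h = len(transferred)
    return [
        [weight * bpp_target[i][j]
         + (1.0 - weight) * sum(t[i][j] for t in transferred) / h
         for j in range(n)]
        for i in range(n)
    ]


def decode(p: Matrix, gamma: float = 1.0,
           min_loop: int = 3) -> List[Tuple[int, int]]:
    """Pick nested pairs maximizing the sum of (gamma + 1) * p[i][j] - 1."""
    n = len(p)
    best = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j], best[i][j - 1])
            gain = (gamma + 1.0) * p[i][j] - 1.0
            if gain > 0:
                score = max(score, best[i + 1][j - 1] + gain)
            for k in range(i + 1, j):
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score

    # Traceback: replay the same recursion to recover the chosen pairs.
    pairs: List[Tuple[int, int]] = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        gain = (gamma + 1.0) * p[i][j] - 1.0
        if gain > 0 and best[i][j] == best[i + 1][j - 1] + gain:
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
            continue
        for k in range(i + 1, j):
            if best[i][j] == best[i][k] + best[k + 1][j]:
                stack.append((i, k))
                stack.append((k + 1, j))
                break
    return pairs
```

In this sketch a base pair that is only weakly supported by the target's own matrix can still be selected when the aligned positions in a homolog pair with high probability, because `decode` is run on the output of `combine` rather than on the target matrix alone. No single structure or alignment is fixed in advance; every homolog pair and every alignment column contributes in proportion to its probability.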
Supporting information and software are available online at: http://www.ncrna.org/software/centroidfold/ismb2009/.
Supplementary data are available at Bioinformatics online.