Alignment of possible secondary structures in multiple RNA sequences using simulated annealing.

Authors

Kim J, Cole J R, Pramanik S

Affiliation

Department of Computer Science, Michigan State University, East Lansing 48824, USA. kimj@cps

Publication information

Comput Appl Biosci. 1996 Aug;12(4):259-67. doi: 10.1093/bioinformatics/12.4.259.

DOI:10.1093/bioinformatics/12.4.259

PMID:8902352

Abstract

Multiple sequence alignment is a useful technique for identifying RNA secondary structures. This paper presents an algorithm that aligns multiple RNA sequences to identify possible common secondary structures. Dot matrices generated from intra-sequence comparisons are used to obtain the candidate common structures. A hit probability for the dot matrices is calculated, and a score function based on this hit probability is defined and then optimized by simulated annealing. The solution set of a multiple sequence alignment is introduced, and the effects on the solution set of increasing the number of alignment gaps and the alignment length are analyzed. Additional strategies are applied to reduce simulated annealing time: one method uses the solution set to reduce computation time, and an optimized transition rule, double shuffle, which moves two positions in a sequence with each iteration, increases the rate of convergence. The algorithm was used to find possible common secondary structures in RNA sequences.
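
The two generic building blocks named in the abstract, an intra-sequence dot matrix and a simulated annealing loop, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The pairing rule (Watson-Crick plus G-U wobble), the function names `dot_matrix`, `accept`, and `anneal`, and the cooling parameters are all assumptions made for illustration. The paper's hit-probability score function and its double shuffle transition rule are not reproduced here because the abstract does not specify them; the caller supplies `score` and `propose` instead.

```python
import math
import random

# Assumed pairing rule: Watson-Crick pairs plus the G-U wobble pair.
# The abstract does not state which pairings count as dot-matrix hits.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def dot_matrix(seq):
    """Intra-sequence dot matrix: cell [i][j] is 1 when bases i and j could pair."""
    n = len(seq)
    return [[1 if (seq[i], seq[j]) in PAIRS else 0 for j in range(n)]
            for i in range(n)]


def accept(delta, temperature, rng=random):
    """Metropolis rule for maximization.

    Improvements (delta >= 0) are always accepted; a worse move is accepted
    with probability exp(delta / temperature).
    """
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)


def anneal(state, score, propose, t_start=1.0, t_end=0.01,
           cooling=0.95, steps_per_t=50, seed=0):
    """Generic simulated annealing with geometric cooling (hypothetical schedule).

    `score(state)` returns a number to maximize; `propose(state, rng)` returns
    a neighbouring state. Both are supplied by the caller.
    """
    rng = random.Random(seed)
    current, current_score = state, score(state)
    best, best_score = current, current_score
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            candidate = propose(current, rng)
            candidate_score = score(candidate)
            if accept(candidate_score - current_score, t, rng):
                current, current_score = candidate, candidate_score
                if current_score > best_score:
                    best, best_score = current, current_score
        t *= cooling
    return best, best_score
```

In an alignment setting, `state` would be a set of gap placements for the sequences, `propose` would shift gaps (the paper's double shuffle moves two positions per iteration), and `score` would be derived from the dot matrices; those pieces are left to the caller in this sketch.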
