Rasmussen Thomas Kiel, Krink Thiemo
EVALife Group, Department of Computer Science, University of Aarhus, Ny Munkegade B540, DK-8000 C Aarhus, Denmark.
Biosystems. 2003 Nov;72(1-2):5-17. doi: 10.1016/s0303-2647(03)00131-x.
Multiple sequence alignment (MSA) is one of the basic problems in computational biology. Realistic problem instances of MSA are computationally intractable for exact algorithms. One way to tackle MSA is to use Hidden Markov Models (HMMs), which are known to be very powerful in the related problem domain of speech recognition. However, the training of HMMs is computationally hard and there is no known exact method that can guarantee optimal training within reasonable computing time. Perhaps the most powerful training method is the Baum-Welch algorithm, which is fast, but bears the problem of stagnation at local optima. In the study reported in this paper, we used a hybrid algorithm combining particle swarm optimization with evolutionary algorithms to train HMMs for the alignment of protein sequences. Our experiments show that our approach yields better alignments for a set of benchmark protein sequences than the most commonly applied HMM training methods, such as Baum-Welch and Simulated Annealing.
多序列比对(MSA)是计算生物学中的基本问题之一。对于精确算法而言,MSA 的实际问题实例在计算上是难以处理的。解决 MSA 的一种方法是使用隐马尔可夫模型(HMM),已知其在语音识别相关问题领域非常强大。然而,HMM 的训练在计算上很困难,并且没有已知的精确方法能够保证在合理的计算时间内实现最优训练。也许最强大的训练方法是 Baum-Welch 算法,它速度很快,但存在陷入局部最优的问题。在本文报道的研究中,我们使用了一种将粒子群优化与进化算法相结合的混合算法来训练 HMM 以进行蛋白质序列比对。我们的实验表明,对于一组基准蛋白质序列,我们的方法比最常用的 HMM 训练方法(如 Baum-Welch 和模拟退火)产生更好的比对结果。