Improved Hidden Markov Model training for multiple sequence alignment by a particle swarm optimization-evolutionary algorithm hybrid.

Author information

Rasmussen Thomas Kiel, Krink Thiemo

Affiliation

EVALife Group, Department of Computer Science, University of Aarhus, Ny Munkegade B540, DK-8000 C Aarhus, Denmark.

Publication information

Biosystems. 2003 Nov;72(1-2):5-17. doi: 10.1016/s0303-2647(03)00131-x.

DOI:10.1016/s0303-2647(03)00131-x

PMID:14642655

Abstract

Multiple sequence alignment (MSA) is one of the basic problems in computational biology. Realistic problem instances of MSA are computationally intractable for exact algorithms. One way to tackle MSA is to use Hidden Markov Models (HMMs), which are known to be very powerful in the related problem domain of speech recognition. However, the training of HMMs is computationally hard and there is no known exact method that can guarantee optimal training within reasonable computing time. Perhaps the most powerful training method is the Baum-Welch algorithm, which is fast, but bears the problem of stagnation at local optima. In the study reported in this paper, we used a hybrid algorithm combining particle swarm optimization with evolutionary algorithms to train HMMs for the alignment of protein sequences. Our experiments show that our approach yields better alignments for a set of benchmark protein sequences than the most commonly applied HMM training methods, such as Baum-Welch and Simulated Annealing.
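The idea of training an HMM with a swarm optimizer instead of Baum-Welch can be illustrated with a minimal sketch. This is not the authors' method: it uses a plain global-best particle swarm with no evolutionary component, and a small fully connected discrete HMM rather than a profile HMM for alignment. The names `decode`, `log_likelihood`, and `pso_train`, the softmax parameterization, the clamping range, and the values of `w`, `c1`, and `c2` are all assumptions chosen for illustration. Fitness is the summed log-likelihood of the training sequences, computed with the scaled forward algorithm.

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def decode(vec, n_states, n_symbols):
    """Map an unconstrained real vector to (start, transition, emission) probabilities."""
    it = iter(vec)
    take = lambda k: [next(it) for _ in range(k)]
    start = softmax(take(n_states))
    trans = [softmax(take(n_states)) for _ in range(n_states)]
    emit = [softmax(take(n_symbols)) for _ in range(n_states)]
    return start, trans, emit

def log_likelihood(hmm, seq):
    """Scaled forward algorithm: log P(seq | hmm) for a non-empty symbol sequence."""
    start, trans, emit = hmm
    n = len(start)
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    ll = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][seq[t]]
                     for j in range(n)]
        scale = sum(alpha)            # rescale each step to avoid underflow
        ll += math.log(scale)
        alpha = [a / scale for a in alpha]
    return ll

def pso_train(seqs, n_states, n_symbols, n_particles=12, iters=40, seed=0):
    """Plain global-best PSO over HMM parameters (illustrative assumption, no EA hybrid)."""
    rng = random.Random(seed)
    dim = n_states + n_states * n_states + n_states * n_symbols
    fitness = lambda v: sum(log_likelihood(decode(v, n_states, n_symbols), s) for s in seqs)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    history = [g_fit]
    w, c1, c2 = 0.7, 1.5, 1.5         # assumed inertia and acceleration coefficients
    for _ in range(iters):
        for k in range(n_particles):
            for d in range(dim):
                vel[k][d] = (w * vel[k][d]
                             + c1 * rng.random() * (best_pos[k][d] - pos[k][d])
                             + c2 * rng.random() * (g_pos[d] - pos[k][d]))
                # clamp so softmax never produces a zero probability
                pos[k][d] = max(-10.0, min(10.0, pos[k][d] + vel[k][d]))
            f = fitness(pos[k])
            if f > best_fit[k]:
                best_pos[k], best_fit[k] = pos[k][:], f
                if f > g_fit:
                    g_pos, g_fit = pos[k][:], f
        history.append(g_fit)
    return decode(g_pos, n_states, n_symbols), history
```

Calling `pso_train([[0, 1, 0, 1], [1, 0, 1]], n_states=2, n_symbols=2)` returns the best HMM found and the global-best log-likelihood after each iteration. Because the swarm searches the parameter space globally rather than following a single ascent path, it is less tied to its starting point than Baum-Welch, at the cost of many more likelihood evaluations.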

Similar articles

Multiple Sequence Alignment with Hidden Markov Models Learned by Random Drift Particle Swarm Optimization.

IEEE/ACM Trans Comput Biol Bioinform. 2014 Jan-Feb;11(1):243-57. doi: 10.1109/TCBB.2013.148.

Using guide trees to construct multiple-sequence evolutionary HMMs.

Bioinformatics. 2003;19 Suppl 1:i147-57. doi: 10.1093/bioinformatics/btg1019.

ProbPFP: a multiple sequence alignment algorithm combining hidden Markov model optimized by particle swarm optimization with partition function.

BMC Bioinformatics. 2019 Nov 25;20(Suppl 18):573. doi: 10.1186/s12859-019-3132-7.

Training HMM structure with genetic algorithm for biological sequence analysis.

Bioinformatics. 2004 Dec 12;20(18):3613-9. doi: 10.1093/bioinformatics/bth454. Epub 2004 Aug 5.

HMM-ModE--improved classification using profile hidden Markov models by optimising the discrimination threshold and modifying emission probabilities with negative training sequences.

BMC Bioinformatics. 2007 Mar 27;8:104. doi: 10.1186/1471-2105-8-104.

A linear memory algorithm for Baum-Welch training.

BMC Bioinformatics. 2005 Sep 19;6:231. doi: 10.1186/1471-2105-6-231.

COACH: profile-profile alignment of protein families using hidden Markov models.

Bioinformatics. 2004 May 22;20(8):1309-18. doi: 10.1093/bioinformatics/bth091. Epub 2004 Feb 12.

HMM-Kalign: a tool for generating sub-optimal HMM alignments.

Bioinformatics. 2007 Nov 15;23(22):3095-7. doi: 10.1093/bioinformatics/btm492. Epub 2007 Oct 6.

Hidden Markov models incorporating fuzzy measures and integrals for protein sequence identification and alignment.

Genomics Proteomics Bioinformatics. 2008 Jun;6(2):98-110. doi: 10.1016/S1672-0229(08)60025-X.

Cited by

learnMSA: learning and aligning large protein families.

Gigascience. 2022 Nov 18;11. doi: 10.1093/gigascience/giac104.

ProbPFP: a multiple sequence alignment algorithm combining hidden Markov model optimized by particle swarm optimization with partition function.

BMC Bioinformatics. 2019 Nov 25;20(Suppl 18):573. doi: 10.1186/s12859-019-3132-7.

SwarmDock and the use of normal modes in protein-protein docking.

Int J Mol Sci. 2010 Sep 28;11(10):3623-48. doi: 10.3390/ijms11103623.

A particle swarm based hybrid system for imbalanced medical data sampling.

BMC Genomics. 2009 Dec 3;10 Suppl 3(Suppl 3):S34. doi: 10.1186/1471-2164-10-S3-S34.

Optimized Particle Swarm Optimization (OPSO) and its application to artificial neural network training.

BMC Bioinformatics. 2006 Mar 10;7:125. doi: 10.1186/1471-2105-7-125.
