Asger Hobolth, Jens Ledet Jensen
Bioinformatics Research Center, University of Aarhus, Aarhus, Denmark.
J Comput Biol. 2005 Mar;12(2):186-203. doi: 10.1089/cmb.2005.12.186.
Identifying and characterizing the structure in genome sequences is one of the principal challenges in modern molecular biology, and comparative genomics offers a powerful tool. In this paper, we introduce a hidden Markov model that allows a comparative analysis of multiple sequences related by a phylogenetic tree, and we present an efficient method for estimating the parameters of the model. The model integrates structure prediction methods for one sequence, statistical multiple alignment methods, and phylogenetic information. This unified model is particularly useful for a detailed characterization of DNA sequences with a common gene. We illustrate the model on a variety of homologous sequences.