Discriminating between rate heterogeneity and interspecific recombination in DNA sequence alignments with phylogenetic factorial hidden Markov models.

Author information

Husmeier Dirk

Affiliation

Biomathematics and Statistics Scotland, Edinburgh, UK.

Publication information

Bioinformatics. 2005 Sep 1;21 Suppl 2:ii166-72. doi: 10.1093/bioinformatics/bti1127.

Abstract

MOTIVATION

A recently proposed method for detecting recombination in DNA sequence alignments is based on the combination of hidden Markov models (HMMs) with phylogenetic trees. Although this method was found to detect breakpoints of recombinant regions more accurately than most existing techniques, it inherently fails to distinguish between recombination and rate variation. In the present paper, we propose to marry the phylogenetic tree to a factorial HMM (FHMM). The states of the first hidden chain represent tree topologies, whereas the states of the second independent hidden chain represent different global scaling factors of the branch lengths. Inference is done in terms of a hierarchical Bayesian model, where parameters and hidden states are sampled from the posterior distribution with Gibbs sampling.
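The two-chain structure described above can be illustrated with a small sketch. It is not the author's MATLAB implementation, and it does not perform the Gibbs sampling used in the paper: it only runs a standard scaled forward recursion over the joint (topology, rate) state to show how two independent hidden chains factorise the transition probability. All names (`sticky_transitions`, `fhmm_log_likelihood`, `emis`) are hypothetical, and the per-site likelihoods `emis[t][k][r]` are assumed to be supplied from elsewhere (in a real phylogenetic model they would come from a tree likelihood computed with branch lengths scaled by rate factor `r`).

```python
import math
from itertools import product


def sticky_transitions(n_states, stay_prob):
    """Transition matrix that keeps the current state with probability stay_prob
    and spreads the remaining mass evenly over the other states (an assumption)."""
    if n_states == 1:
        return [[1.0]]
    off = (1.0 - stay_prob) / (n_states - 1)
    return [[stay_prob if i == j else off for j in range(n_states)]
            for i in range(n_states)]


def fhmm_log_likelihood(emis, A_topo, A_rate):
    """Scaled forward pass over the joint (topology k, rate factor r) state.

    emis[t][k][r] is the assumed likelihood of alignment column t given
    topology k and rate scaling factor r.
    """
    K, R = len(A_topo), len(A_rate)
    states = list(product(range(K), range(R)))
    # Uniform prior over joint states at the first site (an assumption).
    alpha = {s: 1.0 / (K * R) for s in states}
    log_lik = 0.0
    for t, site in enumerate(emis):
        if t > 0:
            # Independent chains: the joint transition probability is the
            # product of the topology transition and the rate transition.
            alpha = {
                (k2, r2): sum(alpha[(k1, r1)] * A_topo[k1][k2] * A_rate[r1][r2]
                              for (k1, r1) in states)
                for (k2, r2) in states
            }
        # Weight each joint state by the likelihood of the current column.
        alpha = {(k, r): alpha[(k, r)] * site[k][r] for (k, r) in states}
        # Rescale to avoid underflow and accumulate the log-likelihood.
        norm = sum(alpha.values())
        log_lik += math.log(norm)
        alpha = {s: a / norm for s, a in alpha.items()}
    return log_lik
```

For example, with three candidate topologies and two rate factors one might call `fhmm_log_likelihood(emis, sticky_transitions(3, 0.9), sticky_transitions(2, 0.8))`. The stay probabilities here are arbitrary illustrative values, not taken from the paper.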

RESULTS

We have tested the proposed model on various synthetic and real-world DNA sequence alignments. The simulation results suggest that, unlike the standard phylogenetic HMM, the phylogenetic FHMM clearly distinguishes between recombination and rate heterogeneity and thereby avoids predicting spurious recombinant regions.

AVAILABILITY

The proposed method has been implemented in a MATLAB package that extends Kevin Murphy's HMM toolbox. Software and data used in our study are available from http://www.bioss.sari.ac.uk/~dirk/Supplements
