Discriminating between rate heterogeneity and interspecific recombination in DNA sequence alignments with phylogenetic factorial hidden Markov models.

Author information

Dirk Husmeier

Affiliation

Biomathematics and Statistics Scotland, Edinburgh, UK.

Publication information

Bioinformatics. 2005 Sep 1;21 Suppl 2:ii166-72. doi: 10.1093/bioinformatics/bti1127.

DOI:10.1093/bioinformatics/bti1127

PMID:16204097

Abstract

MOTIVATION

A recently proposed method for detecting recombination in DNA sequence alignments is based on the combination of hidden Markov models (HMMs) with phylogenetic trees. Although this method was found to detect breakpoints of recombinant regions more accurately than most existing techniques, it inherently fails to distinguish between recombination and rate variation. In the present paper, we propose to marry the phylogenetic tree to a factorial HMM (FHMM). The states of the first hidden chain represent tree topologies, whereas the states of the second independent hidden chain represent different global scaling factors of the branch lengths. Inference is done in terms of a hierarchical Bayesian model, where parameters and hidden states are sampled from the posterior distribution with Gibbs sampling.
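The factorial structure described above can be illustrated with a minimal sketch. It is not the author's MATLAB implementation, and all names and numbers are assumptions chosen for illustration: three topology states (for example, the three unrooted tree topologies for four sequences), two rate-scaling states, and "sticky" transition matrices with hypothetical stay probabilities. Because the two hidden chains are a priori independent, the transition probability between joint (topology, rate) states is the product of the per-chain probabilities, which is the Kronecker product of the two transition matrices.

```python
from itertools import product


def chain_transitions(n_states, stay_prob):
    """Transition matrix of one hidden chain (n_states >= 2).

    The chain keeps its state with probability stay_prob and otherwise
    moves uniformly to one of the other states. This parameterization is
    an illustrative assumption, not taken from the paper.
    """
    move = (1.0 - stay_prob) / (n_states - 1)
    return [[stay_prob if i == j else move for j in range(n_states)]
            for i in range(n_states)]


def joint_transitions(a_topology, a_rate):
    """Joint transition matrix over (topology, rate) state pairs.

    The two chains evolve independently, so each joint transition
    probability is the product of the two per-chain probabilities
    (the Kronecker product of the two matrices).
    """
    states = list(product(range(len(a_topology)), range(len(a_rate))))
    matrix = [[a_topology[s][s2] * a_rate[r][r2] for (s2, r2) in states]
              for (s, r) in states]
    return states, matrix


# Hypothetical sizes and stay probabilities.
A_TOPOLOGY = chain_transitions(3, 0.95)  # chain 1: tree topologies
A_RATE = chain_transitions(2, 0.90)      # chain 2: global branch-length scaling factors
STATES, A_JOINT = joint_transitions(A_TOPOLOGY, A_RATE)

if __name__ == "__main__":
    for state, row in zip(STATES, A_JOINT):
        print(state, [round(p, 4) for p in row])
```

In this representation, a change in the first component of the joint state corresponds to a recombination breakpoint, while a change in the second component alone corresponds to a change in evolutionary rate. Keeping the two in separate chains is what allows them to be told apart.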

RESULTS

We have tested the proposed model on various synthetic and real-world DNA sequence alignments. The simulation results suggest that as opposed to the standard phylogenetic HMM, the phylogenetic FHMM clearly distinguishes between recombination and rate heterogeneity and thereby avoids the prediction of spurious recombinant regions.

AVAILABILITY

The proposed method has been implemented in a MATLAB package that extends Kevin Murphy's HMM toolbox. Software and data used in our study are available from http://www.bioss.sari.ac.uk/~dirk/Supplements

Similar articles

Detecting interspecific recombination with a pruned probabilistic divergence measure.

Bioinformatics. 2005 May 1;21(9):1797-806. doi: 10.1093/bioinformatics/bti151. Epub 2004 Nov 30.

GARD: a genetic algorithm for recombination detection.

Bioinformatics. 2006 Dec 15;22(24):3096-8. doi: 10.1093/bioinformatics/btl474. Epub 2006 Nov 16.

A heuristic Bayesian method for segmenting DNA sequence alignments and detecting evidence for recombination and gene conversion.

Stat Appl Genet Mol Biol. 2006;5:Article27. doi: 10.2202/1544-6115.1238. Epub 2006 Oct 24.

Robust inference of positive selection from recombining coding sequences.

Bioinformatics. 2006 Oct 15;22(20):2493-9. doi: 10.1093/bioinformatics/btl427. Epub 2006 Aug 7.

Computing recombination networks from binary sequences.

Bioinformatics. 2005 Sep 1;21 Suppl 2:ii159-65. doi: 10.1093/bioinformatics/bti1126.

SlidingBayes: exploring recombination using a sliding window approach based on Bayesian phylogenetic inference.

Bioinformatics. 2005 Apr 1;21(7):1274-5. doi: 10.1093/bioinformatics/bti139. Epub 2004 Nov 16.

Enhancing the quality of phylogenetic analysis using fuzzy hidden Markov model alignments.

Stud Health Technol Inform. 2007;129(Pt 2):1245-9.

A gamma mixture model better accounts for among site rate heterogeneity.

Bioinformatics. 2005 Sep 1;21 Suppl 2:ii151-8. doi: 10.1093/bioinformatics/bti1125.

Dual multiple change-point model leads to more accurate recombination detection.

Bioinformatics. 2005 Jul 1;21(13):3034-42. doi: 10.1093/bioinformatics/bti459. Epub 2005 May 24.

Cited by

Clinical and molecular characteristics of carbapenem non-susceptible Escherichia coli: A nationwide survey from Oman.

PLoS One. 2020 Oct 9;15(10):e0239924. doi: 10.1371/journal.pone.0239924. eCollection 2020.

Efficient Inference of Recent and Ancestral Recombination within Bacterial Populations.

Mol Biol Evol. 2017 May 1;34(5):1167-1182. doi: 10.1093/molbev/msx066.

Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins.

Nucleic Acids Res. 2015 Feb 18;43(3):e15. doi: 10.1093/nar/gku1196. Epub 2014 Nov 20.

rbrothers: R Package for Bayesian Multiple Change-Point Recombination Detection.

Evol Bioinform Online. 2013 Jun 12;9:235-8. doi: 10.4137/EBO.S11945. Print 2013.

Evidence of animal mtDNA recombination between divergent populations of the potato cyst nematode Globodera pallida.

Genetica. 2012 Mar;140(1-3):19-29. doi: 10.1007/s10709-012-9651-z. Epub 2012 May 11.

Detection of recombination events in bacterial genomes from large population samples.

Nucleic Acids Res. 2012 Jan;40(1):e6. doi: 10.1093/nar/gkr928. Epub 2011 Nov 7.

Detecting phylogenetic breakpoints and discordance from genome-wide alignments for species tree reconstruction.

Genome Biol Evol. 2011;3:246-58. doi: 10.1093/gbe/evr013. Epub 2011 Feb 28.

Evaluation of methods for detecting conversion events in gene clusters.

BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S45. doi: 10.1186/1471-2105-12-S1-S45.

A mixture model and a hidden markov model to simultaneously detect recombination breakpoints and reconstruct phylogenies.

Evol Bioinform Online. 2009 Jun 25;5:67-79. doi: 10.4137/ebo.s2242.

Ancestral population genomics: the coalescent hidden Markov model approach.

Genetics. 2009 Sep;183(1):259-74. doi: 10.1534/genetics.109.103010. Epub 2009 Jul 6.
