GenNon-h: generating multiple sequence alignments on nonhomogeneous phylogenetic trees.

Affiliation

Centre for Genomic Regulation, Dr. Aiguader 88, 08003 Barcelona, Spain.

Publication information

BMC Bioinformatics. 2012 Aug 28;13:216. doi: 10.1186/1471-2105-13-216.

Abstract

BACKGROUND

A number of software packages are available to generate DNA multiple sequence alignments (MSAs) evolved under continuous-time Markov processes on phylogenetic trees. However, no methods exist for simulating DNA MSAs directly from transition matrices. Moreover, existing software is restricted to time-reversible models and is not optimized to generate nonhomogeneous data (i.e., data with distinct substitution rates on different lineages).

RESULTS

We present the first package designed to generate MSAs evolving under discrete-time Markov processes on phylogenetic trees, directly from probability substitution matrices. Based on the input model and a phylogenetic tree in the Newick format (with branch lengths measured as the expected number of substitutions per site), the algorithm produces DNA alignments of the desired length. GenNon-h is publicly available for download.
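To make the idea of simulating directly from substitution matrices concrete, here is a minimal Python sketch. It is not GenNon-h's implementation or interface: the function names, the dictionary-based tree (used instead of Newick parsing), and the example matrices are all assumptions chosen for illustration. Each edge carries its own 4x4 row-stochastic matrix, which is what makes the simulated data nonhomogeneous, and the matrices need not be symmetric or time-reversible.

```python
import random

NUCLEOTIDES = "ACGT"


def sample_state(probabilities, rng):
    """Draw a state index (0..3) according to a probability vector."""
    return rng.choices(range(4), weights=probabilities, k=1)[0]


def simulate_alignment(root, children, matrices, root_distribution, length, seed=0):
    """Simulate leaf sequences on a rooted tree (hypothetical helper).

    children: dict mapping each node to a list of its children (leaves map to []).
    matrices: dict mapping each non-root node to the 4x4 row-stochastic matrix
              on the edge from its parent (row = parent state, column = child state).
    """
    rng = random.Random(seed)
    # Root sequence: every site is drawn independently from the root distribution.
    sequences = {root: [sample_state(root_distribution, rng) for _ in range(length)]}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children[node]:
            matrix = matrices[child]
            # Each site of the child is drawn from the matrix row of the parent's state.
            sequences[child] = [sample_state(matrix[s], rng) for s in sequences[node]]
            stack.append(child)
    # The alignment consists of the leaf sequences only.
    return {
        name: "".join(NUCLEOTIDES[s] for s in states)
        for name, states in sequences.items()
        if not children[name]
    }


# Illustrative tree ((A,B)X,C)R with a different matrix on each edge.
children = {"R": ["X", "C"], "X": ["A", "B"], "A": [], "B": [], "C": []}
slow = [[0.97, 0.01, 0.01, 0.01],
        [0.01, 0.97, 0.01, 0.01],
        [0.01, 0.01, 0.97, 0.01],
        [0.01, 0.01, 0.01, 0.97]]
fast = [[0.70, 0.10, 0.15, 0.05],   # deliberately non-symmetric rows
        [0.05, 0.80, 0.05, 0.10],
        [0.20, 0.05, 0.70, 0.05],
        [0.05, 0.15, 0.05, 0.75]]
matrices = {"X": slow, "A": fast, "B": slow, "C": fast}
root_distribution = [0.4, 0.1, 0.1, 0.4]  # non-uniform root composition

alignment = simulate_alignment("R", children, matrices, root_distribution, length=20, seed=1)
for name, sequence in sorted(alignment.items()):
    print(name, sequence)
```

Sites are simulated independently and without indels, so all leaf sequences have the same length and form an alignment by construction. Converting between edge matrices and branch lengths in expected substitutions per site is left out of this sketch.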

CONCLUSION

The software presented here is an efficient tool for generating DNA MSAs on a given phylogenetic tree. GenNon-h provides the user with nonstationary or nonhomogeneous phylogenetic data that are well suited for testing complex biological hypotheses and for exploring the limits of reconstruction algorithms and their robustness to such models.

Similar articles

1
GenNon-h: generating multiple sequence alignments on nonhomogeneous phylogenetic trees.
BMC Bioinformatics. 2012 Aug 28;13:216. doi: 10.1186/1471-2105-13-216.
2
SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.
Syst Biol. 2012 Jan;61(1):90-106. doi: 10.1093/sysbio/syr095. Epub 2011 Dec 1.
4
SATCHMO-JS: a webserver for simultaneous protein multiple sequence alignment and phylogenetic tree construction.
Nucleic Acids Res. 2010 Jul;38(Web Server issue):W29-34. doi: 10.1093/nar/gkq298. Epub 2010 Apr 29.
5
Bayesian coestimation of phylogeny and sequence alignment.
BMC Bioinformatics. 2005 Apr 1;6:83. doi: 10.1186/1471-2105-6-83.
7
On the quality of tree-based protein classification.
Bioinformatics. 2005 May 1;21(9):1876-90. doi: 10.1093/bioinformatics/bti244. Epub 2005 Jan 12.
8
An alignment confidence score capturing robustness to guide tree uncertainty.
Mol Biol Evol. 2010 Aug;27(8):1759-67. doi: 10.1093/molbev/msq066. Epub 2010 Mar 5.
9
StatAlign: an extendable software package for joint Bayesian estimation of alignments and evolutionary trees.
Bioinformatics. 2008 Oct 15;24(20):2403-4. doi: 10.1093/bioinformatics/btn457. Epub 2008 Aug 27.
10
Representation in stochastic search for phylogenetic tree reconstruction.
J Biomed Inform. 2006 Feb;39(1):43-50. doi: 10.1016/j.jbi.2005.11.001. Epub 2005 Nov 28.

Cited by

1
EM for phylogenetic topology reconstruction on nonhomogeneous data.
BMC Evol Biol. 2014 Jun 17;14:132. doi: 10.1186/1471-2148-14-132.

