Alignment-free phylogenetics and population genetics.

Author information

Haubold Bernhard

Affiliation

Corresponding author. Bernhard Haubold.

Publication information

Brief Bioinform. 2014 May;15(3):407-18. doi: 10.1093/bib/bbt083. Epub 2013 Nov 29.

DOI: 10.1093/bib/bbt083

PMID: 24291823

Abstract

Phylogenetics and population genetics are central disciplines in evolutionary biology. Both are based on comparative data, today usually DNA sequences. These have become so plentiful that alignment-free sequence comparison is of growing importance in the race between scientists and sequencing machines. In phylogenetics, efficient distance computation is the major contribution of alignment-free methods. A distance measure should reflect the number of substitutions per site, which underlies classical alignment-based phylogeny reconstruction. Alignment-free distance measures are either based on word counts or on match lengths, and I apply examples of both approaches to simulated and real data to assess their accuracy and efficiency. While phylogeny reconstruction is based on the number of substitutions, in population genetics, the distribution of mutations along a sequence is also considered. This distribution can be explored by match lengths, thus opening the prospect of alignment-free population genomics.
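To make the two families of alignment-free measures named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the measures evaluated in the paper: the function names (`kmer_counts`, `word_count_distance`, `mean_match_length`), the choice of Euclidean distance on word frequencies, and the naive quadratic match search are all hypothetical choices made for readability. Neither value is calibrated to substitutions per site; that calibration is what a practical distance measure would have to add.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def word_count_distance(a: str, b: str, k: int = 3) -> float:
    """Euclidean distance between the k-mer frequency vectors of a and b.

    Assumes both sequences are at least k characters long.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    na, nb = sum(ca.values()), sum(cb.values())
    words = set(ca) | set(cb)
    # A Counter returns 0 for words it has not seen, so no key checks are needed.
    return sqrt(sum((ca[w] / na - cb[w] / nb) ** 2 for w in words))


def mean_match_length(query: str, subject: str) -> float:
    """Average length of the longest match starting at each query position.

    For every position i, find the longest prefix of query[i:] that occurs
    anywhere in subject. This naive scan is for illustration only; efficient
    tools would use an index such as a suffix tree or suffix array.
    Assumes query is non-empty.
    """
    total = 0
    for i in range(len(query)):
        length = 0
        while i + length < len(query) and query[i:i + length + 1] in subject:
            length += 1
        total += length
    return total / len(query)


if __name__ == "__main__":
    s1 = "ACGTACGTTGCA"
    s2 = "ACGTTCGTTGCA"  # s1 with one substitution
    print(word_count_distance(s1, s2, k=3))
    print(mean_match_length(s1, s2))
```

Identical sequences give a word-count distance of zero, and the mean match length shrinks as mismatches between the two sequences become more frequent, which is the signal that match-length methods exploit.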

Similar articles

Bayesian coestimation of phylogeny and sequence alignment.

BMC Bioinformatics. 2005 Apr 1;6:83. doi: 10.1186/1471-2105-6-83.

Integrating phylogenetics, phylogeography and population genetics through genomes and evolutionary theory.

Mol Phylogenet Evol. 2013 Dec;69(3):1172-85. doi: 10.1016/j.ympev.2013.06.006. Epub 2013 Jun 22.

A test of neutrality and constant population size based on the mismatch distribution.

Mol Biol Evol. 2004 Apr;21(4):724-31. doi: 10.1093/molbev/msh066. Epub 2004 Feb 12.

Bio++: a set of C++ libraries for sequence analysis, phylogenetics, molecular evolution and population genetics.

BMC Bioinformatics. 2006 Apr 4;7:188. doi: 10.1186/1471-2105-7-188.

Weighted relative entropy for alignment-free sequence comparison based on Markov model.

J Biomol Struct Dyn. 2011 Feb;28(4):545-55. doi: 10.1080/07391102.2011.10508594.

[Foundations of the new phylogenetics].

Zh Obshch Biol. 2004 Jul-Aug;65(4):334-66.

A novel statistical measure for sequence comparison on the basis of k-word counts.

J Theor Biol. 2013 Feb 7;318:91-100. doi: 10.1016/j.jtbi.2012.10.035. Epub 2012 Nov 9.

CONGEN-2010: A semester in a fortnight.

J Hered. 2011 Jul-Aug;102(4):494-5. doi: 10.1093/jhered/esr057.

Sequence comparison alignment-free approach based on suffix tree and L-words frequency.

ScientificWorldJournal. 2012;2012:450124. doi: 10.1100/2012/450124. Epub 2012 Sep 10.

Cited by

Energy entropy vector: a novel approach for efficient microbial genomic sequence analysis and classification.

Brief Bioinform. 2025 Sep 6;26(5). doi: 10.1093/bib/bbaf459.

K-mer-based Approaches to Bridging Pangenomics and Population Genetics.

Mol Biol Evol. 2025 Mar 5;42(3). doi: 10.1093/molbev/msaf047.

An alignment- and reference-free strategy using k-mer present pattern for population genomic analyses.

Mycology. 2024 Jun 5;16(1):309-323. doi: 10.1080/21501203.2024.2358868. eCollection 2025.

An alignment-free method for phylogeny estimation using maximum likelihood.

BMC Bioinformatics. 2025 Mar 7;26(1):77. doi: 10.1186/s12859-025-06080-w.

An alignment-free method for detection of missing regions for phylogenetic analysis.

Heliyon. 2024 Jun 4;10(11):e32227. doi: 10.1016/j.heliyon.2024.e32227. eCollection 2024 Jun 15.

The determinants of the rarity of nucleic and peptide short sequences in nature.

NAR Genom Bioinform. 2024 Apr 4;6(2):lqae029. doi: 10.1093/nargab/lqae029. eCollection 2024 Jun.

Exploring the sorghum race level diversity utilizing 272 sorghum accessions genomic resources.

Front Plant Sci. 2023 Mar 17;14:1143512. doi: 10.3389/fpls.2023.1143512. eCollection 2023.

Next-generation development and application of codon model in evolution.

Front Genet. 2023 Jan 27;14:1091575. doi: 10.3389/fgene.2023.1091575. eCollection 2023.

Quantifying the uncertainty of assembly-free genome-wide distance estimates and phylogenetic relationships using subsampling.

Cell Syst. 2022 Oct 19;13(10):817-829.e3. doi: 10.1016/j.cels.2022.06.007.

Genome-wide alignment-free phylogenetic distance estimation under a no strand-bias model.

Bioinform Adv. 2022 Aug 12;2(1):vbac055. doi: 10.1093/bioadv/vbac055. eCollection 2022.
