An assembly-free method of phylogeny reconstruction using short-read sequences from pooled samples without barcodes.

Affiliations

The Research School of Biology, The Australian National University, ACT, Australia.

School of Biological Sciences, University of Auckland, Auckland, New Zealand.

Publication information

PLoS Comput Biol. 2021 Sep 13;17(9):e1008949. doi: 10.1371/journal.pcbi.1008949. eCollection 2021 Sep.

DOI:10.1371/journal.pcbi.1008949

PMID:34516547

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8460051/

Abstract

A current strategy for obtaining haplotype information from several individuals involves short-read sequencing of pooled amplicons, where the fragments from each individual are identified by a unique DNA barcode. In this paper, we report a new method to recover the phylogeny of haplotypes from short-read sequences obtained using pooled amplicons from a mixture of individuals, without barcoding. The method, AFPhyloMix, accepts an alignment of the mixture of reads against a reference sequence, obtains the single-nucleotide polymorphism (SNP) patterns along the alignment, and constructs the phylogenetic tree according to those SNP patterns. AFPhyloMix adopts a Bayesian inference model to estimate the phylogeny of the haplotypes and their relative abundances, given that the number of haplotypes is known. In our simulations, AFPhyloMix achieved at least 80% accuracy in recovering the phylogenies and relative abundances of the constituent haplotypes, for mixtures with up to 15 haplotypes. AFPhyloMix also worked well on a real data set of kangaroo mitochondrial DNA sequences.
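To make the first step concrete, the following is a minimal sketch of how candidate SNP sites and their allele frequencies might be read off an alignment of pooled reads against a reference. It is not the AFPhyloMix implementation: the function names, the `min_freq` and `min_depth` thresholds, and the gap-free `(start, sequence)` read representation are all assumptions made for illustration. The Bayesian inference of the tree and the haplotype abundances is not attempted here.

```python
from collections import Counter


def column_counts(reference, reads):
    """Count the bases observed at each reference position.

    `reads` is a list of (start, sequence) pairs, assumed to be already
    aligned to `reference` without gaps (a simplifying assumption).
    """
    counts = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference) and base in "ACGT":
                counts[pos][base] += 1
    return counts


def snp_sites(reference, reads, min_freq=0.05, min_depth=4):
    """Return {position: (non_reference_base, frequency)} for candidate SNPs.

    Both thresholds are hypothetical; they only serve to filter out
    low-coverage columns and rare (possibly erroneous) bases.
    """
    sites = {}
    for pos, counter in enumerate(column_counts(reference, reads)):
        depth = sum(counter.values())
        if depth < min_depth:
            continue
        # Keep only bases that differ from the reference at this position.
        alt = [(n, b) for b, n in counter.items() if b != reference[pos]]
        if not alt:
            continue
        n, base = max(alt)  # most frequent non-reference base
        freq = n / depth
        if freq >= min_freq:
            sites[pos] = (base, freq)
    return sites


# Toy pooled sample: one read in four carries a variant at positions 2 and 6.
reference = "ACGTACGTAC"
reads = [
    (0, "ACGTA"), (0, "ACGTA"), (0, "ACTTA"), (0, "ACGTA"),
    (3, "TACGTAC"), (3, "TACGTAC"), (3, "TACATAC"), (3, "TACGTAC"),
]
sites = snp_sites(reference, reads)
print(sites)
```

In an unbarcoded mixture, the frequency of a variant at a site reflects the combined abundance of the haplotypes that carry it, which is why per-site frequencies of this kind are a natural input for estimating relative abundances alongside the tree.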

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e87/8460051/16ec97288355/pcbi.1008949.g001.jpg
