Bracken: estimating species abundance in metagenomics data.

Authors

Lu Jennifer, Breitwieser Florian P, Thielen Peter, Salzberg Steven L

Affiliations

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, United States.

Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, United States.

Publication information

PeerJ Comput Sci. 2017;3. doi: 10.7717/peerj-cs.104. Epub 2017 Jan 2.

Abstract

Metagenomic experiments attempt to characterize microbial communities using high-throughput DNA sequencing. Identification of the microorganisms in a sample provides information about the genetic profile, population structure, and role of microorganisms within an environment. Until recently, most metagenomics studies focused on high-level characterization at the level of phyla, or alternatively sequenced the 16S ribosomal RNA gene that is present in bacterial species. As the cost of sequencing has fallen, though, metagenomics experiments have increasingly used unbiased shotgun sequencing to capture all the organisms in a sample. This approach requires a method for estimating abundance directly from the raw read data. Here we describe a fast, accurate new method that computes the abundance at the species level using the reads collected in a metagenomics experiment. Bracken (Bayesian Reestimation of Abundance after Classification with KrakEN) uses the taxonomic assignments made by Kraken, a very fast read-level classifier, along with information about the genomes themselves to estimate abundance at the species level, the genus level, or above. We demonstrate that Bracken can produce accurate species- and genus-level abundance estimates even when a sample contains multiple near-identical species.
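The re-estimation idea in the abstract can be illustrated with a small sketch. It is not Bracken's implementation: the function name, its inputs, and the choice to approximate each species' prior by its species-level read count are all assumptions made for illustration. It shows only how reads that a classifier could place no lower than the genus level might be redistributed among that genus's species using Bayes' rule.

```python
def redistribute_genus_reads(species_reads, genus_reads, frac_at_genus):
    """Reassign genus-level reads to species with Bayes' rule (illustrative only).

    species_reads: dict of species -> reads classified at that species (hypothetical input)
    genus_reads:   number of reads the classifier left at the genus level
    frac_at_genus: dict of species -> assumed fraction of that species' reads
                   that the classifier places at the genus level, i.e. P(genus | species)
    """
    # Unnormalized posterior weight: P(genus | species) * P(species).
    # Assumption: P(species) is approximated by the species-level read count.
    weights = {s: frac_at_genus[s] * n for s, n in species_reads.items()}
    total = sum(weights.values())
    if total == 0:
        # Nothing to base a redistribution on; leave the counts unchanged.
        return {s: float(n) for s, n in species_reads.items()}
    # Each species keeps its own reads and gains its posterior share of genus reads.
    return {
        s: n + genus_reads * weights[s] / total
        for s, n in species_reads.items()
    }


if __name__ == "__main__":
    # Made-up numbers for two near-identical species in one genus.
    estimate = redistribute_genus_reads(
        species_reads={"species_A": 300, "species_B": 100},
        genus_reads=200,
        frac_at_genus={"species_A": 0.5, "species_B": 0.5},
    )
    for name, reads in estimate.items():
        print(name, reads)
```

In this toy model the genus-level reads are split in proportion to the evidence already seen for each species, so the total read count is preserved while the species-level estimates absorb reads that would otherwise be stranded at the genus.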
