

Pseudoalignment for metagenomic read assignment.

Author information

Department of Molecular and Cell Biology, UC Berkeley, Berkeley, CA, USA.

Department of Genetics, Stanford University, Stanford, CA, USA.

Publication information

Bioinformatics. 2017 Jul 15;33(14):2082-2088. doi: 10.1093/bioinformatics/btx106.

Abstract

MOTIVATION

Read assignment is an important first step in many metagenomic analysis workflows, providing the basis for identification and quantification of species. However, ambiguity among the sequences of many strains makes it difficult to assign reads at the lowest taxonomic level, so reads are typically assigned to the taxonomic level at which they are unambiguous. We explore connections between metagenomic read assignment and the quantification of transcripts from RNA-Seq data in order to develop novel methods for rapid and accurate quantification of metagenomic strains.

RESULTS

We find that the recent idea of pseudoalignment, introduced in the RNA-Seq context, is highly applicable in the metagenomics setting. When pseudoalignment is coupled with the Expectation-Maximization (EM) algorithm, reads can be assigned far more accurately and quickly than is currently possible with state-of-the-art software, making it possible and practical for the first time to analyze abundances of individual genomes in metagenomics projects.
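
To illustrate how pseudoalignment and EM can fit together, the following is a minimal sketch and not the code of the metakallisto pipeline. It assumes that pseudoalignment has already reduced the reads to equivalence classes, each recording the set of reference genomes a group of reads is compatible with and how many reads fall in that group. The function name `em_abundances`, its parameters, and the toy counts are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, List


def em_abundances(
    eq_counts: Dict[FrozenSet[int], int],
    lengths: List[float],
    n_iter: int = 200,
) -> List[float]:
    """Estimate the fraction of reads originating from each reference genome.

    eq_counts maps an equivalence class (the set of genome indices a read is
    compatible with) to the number of reads in that class. lengths holds one
    (effective) length per genome. Both inputs are illustrative assumptions.
    """
    n = len(lengths)
    total = sum(eq_counts.values())
    alpha = [1.0 / n] * n  # start from uniform read fractions

    for _ in range(n_iter):
        assigned = [0.0] * n
        for genomes, count in eq_counts.items():
            # E-step: split the reads of this class among its genomes in
            # proportion to alpha / length.
            weights = {g: alpha[g] / lengths[g] for g in genomes}
            denom = sum(weights.values())
            if denom == 0.0:
                continue  # no compatible genome has any mass left
            for g, w in weights.items():
                assigned[g] += count * w / denom
        # M-step: the new fractions are the normalised expected read counts.
        alpha = [a / total for a in assigned]
    return alpha


# Toy input: 60 reads unique to genome 0, 20 unique to genome 1,
# and 20 reads compatible with both genomes.
toy_counts = {
    frozenset({0}): 60,
    frozenset({1}): 20,
    frozenset({0, 1}): 20,
}
estimates = em_abundances(toy_counts, lengths=[1000.0, 1000.0])
print(estimates)
```

In this toy case the ambiguous reads are shared in proportion to the current estimates, so the fractions approach 0.75 and 0.25 rather than the 0.6 and 0.2 supported by the unique reads alone. Working on equivalence-class counts instead of individual reads keeps each EM iteration proportional to the number of distinct classes.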

AVAILABILITY AND IMPLEMENTATION

Pipeline and analysis code can be downloaded from http://github.com/pachterlab/metakallisto.

CONTACT

lpachter@math.berkeley.edu.



