
MEDUSA: A Pipeline for Sensitive Taxonomic Classification and Flexible Functional Annotation of Metagenomic Shotgun Sequences.

Authors

Morais Diego A A, Cavalcante João V F, Monteiro Shênia S, Pasquali Matheus A B, Dalmolin Rodrigo J S

Affiliations

Bioinformatics Multidisciplinary Environment, Federal University of Rio Grande do Norte, Natal, Brazil.

Graduate Program in Engineering and Natural Resources Management, Federal University of Campina Grande, Campina Grande, Brazil.

Publication information

Front Genet. 2022 Mar 7;13:814437. doi: 10.3389/fgene.2022.814437. eCollection 2022.

DOI: 10.3389/fgene.2022.814437
PMID: 35330728
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8940201/

Abstract

Metagenomic studies reveal the taxonomic composition of microbial communities and the functions they perform. Because a complete metagenomic analysis requires different tools for different purposes, selecting and setting up these tools remains challenging. Furthermore, the chosen toolset affects the accuracy, the formatting, and the functional identifiers reported in the results, and therefore the interpretation of the results and the biological answer obtained. Thus, we surveyed state-of-the-art tools available in the literature, created simulated datasets, and performed benchmarks to design a sensitive and flexible metagenomic analysis pipeline. Here we present MEDUSA, an efficient pipeline for comprehensive metagenomic analyses. It performs preprocessing, assembly, alignment, taxonomic classification, and functional annotation on shotgun data, and supports user-built dictionaries to transfer annotations to any functional identifier. MEDUSA includes several tools, such as fastp, Bowtie2, DIAMOND, Kaiju, and MEGAHIT, as well as a novel tool implemented in Python to transfer annotations to BLAST/DIAMOND alignment results. These tools are installed via Conda, and the workflow is managed by Snakemake, which eases setup and execution. Compared with MEGAN 6 Community Edition, MEDUSA correctly identifies more species, especially less abundant ones, and is better suited to functional analysis using Gene Ontology identifiers.
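
The idea of transferring annotations to alignment results through a user-built dictionary can be illustrated with a minimal Python sketch. This is not MEDUSA's own tool or its interface: the function names `load_dictionary` and `transfer_annotations`, the two-column dictionary layout (subject ID, functional identifier), and the sample IDs are all assumptions made for illustration. The only thing taken as given is the standard 12-column BLAST/DIAMOND tabular layout, where the first column is the query ID and the second is the subject ID.

```python
import csv
import io
from collections import defaultdict


def load_dictionary(handle):
    """Read a hypothetical two-column TSV: subject ID -> functional identifier.

    One subject may map to several identifiers, so values are sets.
    """
    mapping = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:
            mapping[row[0]].add(row[1])
    return mapping


def transfer_annotations(alignments, mapping):
    """Attach functional identifiers to each query based on its hits.

    `alignments` is a file-like object in BLAST/DIAMOND tabular format
    (column 1 = query ID, column 2 = subject ID). Queries whose hits are
    absent from the dictionary are kept with an empty list.
    """
    annotations = defaultdict(set)
    for row in csv.reader(alignments, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        query, subject = row[0], row[1]
        annotations[query].update(mapping.get(subject, ()))
    return {query: sorted(ids) for query, ids in annotations.items()}


# Toy inputs; the accession numbers and GO terms are placeholders.
dictionary_tsv = (
    "P12345\tGO:0008150\n"
    "P12345\tGO:0003674\n"
    "Q99999\tGO:0005575\n"
)
alignment_tsv = (
    "read1\tP12345\t98.0\t50\t1\t0\t1\t50\t10\t59\t1e-20\t100\n"
    "read2\tQ99999\t95.0\t48\t2\t0\t1\t48\t5\t52\t1e-15\t90\n"
    "read3\tX00000\t90.0\t45\t4\t0\t1\t45\t7\t51\t1e-10\t70\n"
)

result = transfer_annotations(
    io.StringIO(alignment_tsv),
    load_dictionary(io.StringIO(dictionary_tsv)),
)
for query, identifiers in result.items():
    print(query, identifiers)
```

Because the dictionary is just a mapping from subject IDs to arbitrary strings, the same sketch would work for identifier systems other than Gene Ontology; only the dictionary file would change.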


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dc7/8940201/10da07fcba1e/fgene-13-814437-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dc7/8940201/76145ff079e4/fgene-13-814437-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dc7/8940201/80daa4173222/fgene-13-814437-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dc7/8940201/298cff097503/fgene-13-814437-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dc7/8940201/fdd650027a1e/fgene-13-814437-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0dc7/8940201/0b9689aa9215/fgene-13-814437-g005.jpg
