The metagenomic data life-cycle: standards and best practices.

Authors

Ten Hoopen Petra, Finn Robert D, Bongo Lars Ailo, Corre Erwan, Fosso Bruno, Meyer Folker, Mitchell Alex, Pelletier Eric, Pesole Graziano, Santamaria Monica, Willassen Nils Peder, Cochrane Guy

Affiliations

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom.

UiT The Arctic University of Norway, Tromsø N-9037, Norway.

Publication information

Gigascience. 2017 Aug 1;6(8):1-11. doi: 10.1093/gigascience/gix047.

DOI:10.1093/gigascience/gix047

PMID:28637310

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5737865/

Abstract

Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonized way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (i) material sampling, (ii) material sequencing, (iii) data analysis, and (iv) data archiving and publishing. Taking examples from marine research, we summarize essential variables used to describe material sampling processes and sequencing procedures in a metagenomics experiment. These aspects of metagenomics dataset generation have been to some extent addressed by the scientific community, but greater awareness and adoption is still needed. We emphasize the lack of standards relating to reporting how metagenomics datasets are analysed and how the metagenomics data analysis outputs should be archived and published. We propose best practice as a foundation for a community standard to enable reproducibility and better sharing of metagenomics datasets, leading ultimately to greater metagenomics data reuse and repurposing.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6149/5737865/a03a064ad8da/gix047fig1.jpg
