U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA.
Database (Oxford). 2023 Feb 16;2023. doi: 10.1093/database/baad001.
The power of next-generation sequencing has resulted in explosive growth in the number of projects aiming to understand the metagenomic diversity of complex microbial environments. The interdisciplinary nature of the microbiome research community, along with the absence of reporting standards for microbiome data and samples, poses a significant challenge for follow-up studies. Commonly used names of metagenomes and metatranscriptomes in public databases currently lack the essential information needed to accurately describe and classify the underlying samples, which makes comparative analysis difficult to conduct and often results in misclassified sequences in data repositories. The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute has been at the forefront of addressing this challenge by developing a standardized nomenclature system for naming microbiome samples. GOLD, now in its twenty-fifth year, continues to enrich the research community with hundreds of thousands of metagenomes and metatranscriptomes that carry well-curated and easy-to-understand names. In this manuscript, we describe the overall naming process, which can be easily adopted by researchers worldwide. Additionally, we propose the use of this naming system as a best practice for the scientific community to facilitate better interoperability and reusability of microbiome data.