Department of Environmental Microbiology, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Saxony 04318, Germany.
Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony 04107, Germany.
Nucleic Acids Res. 2020 Jan 8;48(D1):D626-D632. doi: 10.1093/nar/gkz994.
Microbiome studies focused on the genetic potential of microbial communities (metagenomics) have become standard within microbial ecology. MG-RAST and the Sequence Read Archive (SRA), the two main metagenome repositories, contain more than 202 858 publicly available metagenomes, and this number has increased exponentially. However, mining these databases can be challenging because the data are often misannotated, misleading and decentralized. The main goal of TerrestrialMetagenomeDB is to make it easier for scientists to find terrestrial metagenomes of interest that can be compared with novel datasets in meta-analyses. We defined terrestrial metagenomes as those that do not belong to marine environments. Further, we curated the database using text mining to assign potential descriptive keywords that better contextualize environmental aspects of terrestrial metagenomes, such as biomes and materials. TerrestrialMetagenomeDB release 1.0 includes 15 022 terrestrial metagenomes from SRA and MG-RAST. Together, the downloadable data amount to 68 Tbp. In total, 199 terrestrial terms were divided into 14 categories. These metagenomes span 83 countries, 30 biomes and 7 main source materials. TerrestrialMetagenomeDB is publicly available at https://webapp.ufz.de/tmdb.