Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia.
Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Inhoffenstraße 7B, 38124, Brunswick, Germany.
Sci Rep. 2021 Jun 1;11(1):11511. doi: 10.1038/s41598-021-90799-y.
The exponential rise of metagenomic sequencing is delivering massive amounts of functional environmental genomics data. However, this also creates a procedural bottleneck for ongoing re-analysis: as reference databases grow and methods improve, analyses need to be updated for consistency, which requires access to increasingly demanding bioinformatic and computational resources. Here, we present the KAUST Metagenomic Analysis Platform (KMAP), a new integrated, open, web-based tool for the comprehensive exploration of shotgun metagenomic data. We illustrate the capacities KMAP provides through the re-assembly of ~27,000 public metagenomic samples captured in ~450 studies sampled across ~77 diverse habitats. In this pilot study, a small subset of these metagenomic assemblies is grouped into 36 new habitat-specific gene catalogs, all based on full-length (complete) genes. Extensive taxonomic and gene annotations are stored in Gene Information Tables (GITs), a simple, tractable data integration format suited to analysis through the command line or to database management. The KMAP pilot study enables the exploration and comparison of microbial GITs across different habitats, covering over 275 million genes. KMAP data and analyses are accessible at https://www.cbrc.kaust.edu.sa/aamg/kmap.start.
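To make the idea of a Gene Information Table concrete, the following is a minimal Python sketch of how a tabular, one-gene-per-row annotation file could be summarized from a script. The abstract does not specify the GIT layout, so the tab-separated format, the column names (`gene_id`, `habitat`, `taxon`, `function`), the sample rows, and the helper `count_by_column` are all assumptions made for illustration only.

```python
import csv
import io
from collections import Counter

# Hypothetical GIT excerpt. The tab-separated layout, column names, and
# values are illustrative assumptions, not the actual KMAP format.
SAMPLE_GIT = (
    "gene_id\thabitat\ttaxon\tfunction\n"
    "g1\tmarine\tProteobacteria\ttransporter\n"
    "g2\tmarine\tCyanobacteria\tphotosystem\n"
    "g3\tsoil\tActinobacteria\ttransporter\n"
)


def count_by_column(git_text, column):
    """Count rows (genes) per distinct value of one column in a tab-separated table."""
    reader = csv.DictReader(io.StringIO(git_text), delimiter="\t")
    return Counter(row[column] for row in reader)


# Summarize the toy table by habitat and by functional annotation.
habitat_counts = count_by_column(SAMPLE_GIT, "habitat")
function_counts = count_by_column(SAMPLE_GIT, "function")
print(habitat_counts.most_common())
print(function_counts.most_common())
```

With a real table, the same pattern would apply by reading from a file handle instead of an in-memory string, provided the file is delimited text with a header row.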