Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China.
World Data Center for Microorganisms, Beijing 100101, China.
Nucleic Acids Res. 2021 Jan 8;49(D1):D694-D705. doi: 10.1093/nar/gkaa957.
Taxonomic and functional research on microorganisms increasingly relies on genome-based data and methods. As the depository of the Global Catalogue of Microorganisms (GCM) 10K prokaryotic type strain sequencing project, the Global Catalogue of Type Strain (gcType) has published 1049 type strain genomes sequenced by the GCM 10K project; the strains have validly published status and are preserved in culture collections worldwide. In addition, gcType provides >12 000 publicly available type strain genome sequences from GenBank, incorporated using quality control criteria and standard data annotation pipelines to form a high-quality reference database. The database integrates type strain sequences with their phenotypic information to facilitate phenotypic and genotypic analyses. Multiple formats of cross-genome search and interactive interfaces allow extensive exploration of the database's resources. In this study, we describe web-based data analysis pipelines for genomic analyses and genome-based taxonomy, which could serve as a one-stop platform for the identification of prokaryotic species. The number of published type strain genomes will continue to grow as the GCM 10K project expands its collaboration with culture collections worldwide. Data from this project are shared with the International Nucleotide Sequence Database Collaboration. gcType is freely accessible at http://gctype.wdcm.org/.