Centre for Bioinformatics, Faculty of Science and Technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway.
Department of Information Technology, UiT The Arctic University of Norway, PO Box 6050 Langnes, Tromsø N-9037, Norway.
Nucleic Acids Res. 2018 Jan 4;46(D1):D692-D699. doi: 10.1093/nar/gkx1036.
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), publicly available resources that promote marine research and innovation. These data resources, implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. MarRef is a reference database of completely sequenced marine prokaryotic genomes, whereas MarDB includes all incompletely sequenced prokaryotic genomes regardless of their level of completeness. The third database, MarCat, is a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record consists of 106 metadata fields, including attributes for sampling, sequencing, assembly and annotation in addition to organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/.