A RESTful API for accessing microbial community data for MG-RAST.

Authors

Wilke Andreas, Bischof Jared, Harrison Travis, Brettin Tom, D'Souza Mark, Gerlach Wolfgang, Matthews Hunter, Paczian Tobias, Wilkening Jared, Glass Elizabeth M, Desai Narayan, Meyer Folker

Affiliations

Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, Illinois, United States of America; Computation Institute, University of Chicago, Chicago, Illinois, United States of America.

Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, Illinois, United States of America.

Publication information

PLoS Comput Biol. 2015 Jan 8;11(1):e1004008. doi: 10.1371/journal.pcbi.1004008. eCollection 2015 Jan.

DOI:10.1371/journal.pcbi.1004008

PMID:25569221

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4287624/

Abstract

Metagenomic sequencing has produced significant amounts of data in recent years. For example, as of summer 2013, MG-RAST has been used to annotate over 110,000 data sets totaling over 43 terabases. With metagenomic sequencing finding even wider adoption in the scientific community, the existing web-based analysis tools and infrastructure in MG-RAST provide limited capability for data retrieval and analysis, such as comparative analysis between multiple data sets. Moreover, although the system provides many analysis tools, it is not comprehensive. By opening MG-RAST up via a web services API (application programming interface) we have greatly expanded access to MG-RAST data, as well as provided a mechanism for the use of third-party analysis tools with MG-RAST data. This RESTful API makes all data and data objects created by the MG-RAST pipeline accessible as JSON objects. As part of the DOE Systems Biology Knowledgebase project (KBase, http://kbase.us) we have implemented a web services API for MG-RAST. This API complements the existing MG-RAST web interface and constitutes the basis of KBase's microbial community capabilities. In addition, the API exposes a comprehensive collection of data to programmers. This API, which uses a RESTful (Representational State Transfer) implementation, is compatible with most programming environments and should be easy to use for end users and third parties. It provides comprehensive access to sequence data, quality control results, annotations, and many other data types. Where feasible, we have used standards to expose data and metadata. Code examples are provided in a number of languages both to show the versatility of the API and to provide a starting point for users. We present an API that exposes the data in MG-RAST for consumption by our users, greatly enhancing the utility of the MG-RAST service.
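As an illustration of the access pattern the abstract describes (a resource addressed by URL and returned as a JSON object), the following is a minimal Python sketch. It is not taken from the paper. The base URL, the `metagenome` resource name, the `verbosity` query parameter, and the identifier are assumptions chosen for illustration, and the helper names `build_url`, `parse_object`, and `fetch_object` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the service's base URL; the abstract does not state it.
API_BASE = "https://api.mg-rast.org"


def build_url(resource, identifier, **params):
    """Compose a REST-style URL of the form <base>/<resource>/<id>?<query>."""
    url = f"{API_BASE}/{resource}/{identifier}"
    if params:
        # Sort the parameters so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


def parse_object(payload):
    """Decode a JSON response body and check that it is a JSON object."""
    obj = json.loads(payload)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    return obj


def fetch_object(resource, identifier, **params):
    """Issue an HTTP GET and return the decoded JSON object (needs network access)."""
    with urlopen(build_url(resource, identifier, **params), timeout=30) as resp:
        return parse_object(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical identifier, for illustration only; no request is sent here.
    print(build_url("metagenome", "mgm0000000.3", verbosity="minimal"))
```

Keeping URL construction and JSON decoding separate from the network call lets both be checked offline; `fetch_object` would be the only part that contacts the service.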

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ed1/4287624/a363e2745794/pcbi.1004008.g001.jpg
