Christley Scott, Aguiar Ademar, Blanck George, Breden Felix, Bukhari Syed Ahmad Chan, Busse Christian E, Jaglale Jerome, Harikrishnan Srilakshmy L, Laserson Uri, Peters Bjoern, Rocha Artur, Schramm Chaim A, Taylor Sarah, Vander Heiden Jason Anthony, Zimonja Bojan, Watson Corey T, Corrie Brian, Cowell Lindsay G
Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX, United States.
Centre for Information Systems and Computer Graphics, Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal.
Front Big Data. 2020 Jun 17;3:22. doi: 10.3389/fdata.2020.00022. eCollection 2020.
The Adaptive Immune Receptor Repertoire (AIRR) Community is a research-driven group that is establishing a clear set of community-accepted data and metadata standards; standards-based reference implementation tools; and policies and practices for infrastructure to support the deposit, curation, storage, and use of high-throughput sequencing data from B-cell and T-cell receptor repertoires (AIRR-seq data). The AIRR Data Commons is a distributed system of data repositories that uses a common data model, a common query language, and common interoperability formats for storing, querying, and downloading AIRR-seq data. This paper describes the principal technical standards for the AIRR Data Commons: the AIRR Data Model for repertoires and rearrangements, the AIRR Data Commons (ADC) API for programmatic query of data repositories, a reference implementation for ADC API services, and tools for querying and validating data repositories that support the ADC API. AIRR-seq data repositories can become part of the AIRR Data Commons by implementing the data model and API. The AIRR Data Commons allows AIRR-seq data to be reused for novel analyses and empowers researchers to discover new biological insights about the adaptive immune system.
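As an illustration of what a programmatic query against an ADC API repository might look like, the following is a minimal sketch and not the reference implementation. It assumes the query is a JSON document sent by HTTP POST, with a `filters` object built from `op`/`content` pairs and `from`/`size` paging keys. The base URL, the endpoint name `repertoire`, the field names, and the helper functions are all assumptions chosen for illustration; the abstract does not specify them, so check them against the ADC API documentation before use.

```python
import json
import urllib.request


def build_filter(field, value, op="="):
    """Build one filter clause (assumed shape: op plus field/value content)."""
    return {"op": op, "content": {"field": field, "value": value}}


def build_query(filters, size=10, start=0):
    """Combine filter clauses into one query body; several clauses are AND-ed."""
    if not filters:
        raise ValueError("at least one filter clause is required")
    if len(filters) == 1:
        combined = filters[0]
    else:
        combined = {"op": "and", "content": list(filters)}
    return {"filters": combined, "from": start, "size": size}


def post_query(base_url, endpoint, query, timeout=30):
    """POST the query as JSON and return the decoded JSON response.

    base_url and endpoint are placeholders supplied by the caller.
    """
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{endpoint}",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Hypothetical field names and values, for illustration only.
query = build_query(
    [
        build_filter("subject.species.id", "NCBITAXON:9606"),
        build_filter("sample.pcr_target.pcr_target_locus", "IGH"),
    ],
    size=5,
)
print(json.dumps(query, indent=2))
```

To run the query, one would call something like `post_query("https://repository.example.org/airr/v1", "repertoire", query)`, where the host is a placeholder. Because every repository in the commons exposes the same data model and query language, the same query body could in principle be sent unchanged to each repository.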