SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.

Authors

Hitz Benjamin C, Rowe Laurence D, Podduturi Nikhil R, Glick David I, Baymuradov Ulugbek K, Malladi Venkat S, Chan Esther T, Davidson Jean M, Gabdank Idan, Narayana Aditi K, Onate Kathrina C, Hilton Jason, Ho Marcus C, Lee Brian T, Miyasato Stuart R, Dreszer Timothy R, Sloan Cricket A, Strattan J Seth, Tanaka Forrest Y, Hong Eurie L, Cherry J Michael

Affiliations

Stanford University School of Medicine, Department of Genetics, Stanford, California, United States of America.

University of California Santa Cruz, Baskin School of Engineering, Center for Biomolecular Science and Engineering, Santa Cruz, California, United States of America.

Publication information

PLoS One. 2017 Apr 12;12(4):e0175310. doi: 10.1371/journal.pone.0175310. eCollection 2017.

DOI:10.1371/journal.pone.0175310

PMID:28403240

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5389787/

Abstract

The Encyclopedia of DNA Elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements, initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues, using a wide array of experimental techniques to study the chromatin structure and the regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details become increasingly intricate and demand careful curation. The ENCODE DCC has created a general-purpose software system, known as SnoVault, that supports metadata and file submission and provides a database for metadata storage, web pages for displaying the metadata, and a robust API for querying the metadata. The software is fully open-source; code and installation instructions can be found at http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ (for storing genomic data in the manner of ENCODE). The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data), has been released as a separate Python package.
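As a rough illustration of what "an API for querying the metadata" could look like from a client's side, the sketch below only builds a search URL and does not contact any server. The abstract does not specify the API, so the base URL, the `/search/` path, and the `type`, `format`, `limit`, and `status` parameters are all assumptions modeled on the public ENCODE portal, and `build_search_url` is a hypothetical helper rather than part of SnoVault.

```python
from urllib.parse import urlencode

# Assumption: base URL of a SnoVault-backed portal (not given in the abstract).
BASE_URL = "https://www.encodeproject.org"


def build_search_url(item_type, limit=5, **filters):
    """Build a metadata search URL (hypothetical helper, not a SnoVault function).

    item_type: object type to search for, e.g. "Experiment" (assumed name).
    limit:     maximum number of results requested (assumed parameter).
    filters:   extra field=value constraints, e.g. status="released" (assumed).
    """
    params = [("type", item_type), ("format", "json"), ("limit", str(limit))]
    # Sort the extra filters so the resulting URL is deterministic.
    params.extend(sorted(filters.items()))
    return f"{BASE_URL}/search/?{urlencode(params)}"


url = build_search_url("Experiment", limit=2, status="released")
print(url)
```

A client would then issue an HTTP GET against the resulting URL and parse the JSON response; the exact response layout depends on the deployed schema and is not described in the abstract.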
