Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center.

Author affiliations

BD2K-LINCS Data Coordination and Integration Center, University of Miami, Miami, FL 33136, USA.

Department of Human Genetics and Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA.

Publication information

Sci Data. 2018 Jun 19;5:180117. doi: 10.1038/sdata.2018.117.

Abstract

The NIH-funded LINCS Consortium is creating an extensive reference library of cell-based perturbation response signatures and sophisticated informatics tools incorporating a large number of perturbagens, model systems, and assays. To date, more than 350 datasets have been generated including transcriptomics, proteomics, epigenomics, cell phenotype and competitive binding profiling assays. The large volume and variety of data necessitate rigorous data standards and effective data management including modular data processing pipelines and end-user interfaces to facilitate accurate and reliable data exchange, curation, validation, standardization, aggregation, integration, and end user access. Deep metadata annotations and the use of qualified data standards enable integration with many external resources. Here we describe the end-to-end data processing and management at the DCIC to generate a high-quality and persistent product. Our data management and stewardship solutions enable a functioning Consortium and make LINCS a valuable scientific resource that aligns with big data initiatives such as the BD2K NIH Program and concords with emerging data science best practices including the findable, accessible, interoperable, and reusable (FAIR) principles.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ffb7/6007090/25897fe07246/sdata2018117-f1.jpg
