Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space.

Authors

Schatz Michael C, Philippakis Anthony A, Afgan Enis, Banks Eric, Carey Vincent J, Carroll Robert J, Culotti Alessandro, Ellrott Kyle, Goecks Jeremy, Grossman Robert L, Hall Ira M, Hansen Kasper D, Lawson Jonathan, Leek Jeffrey T, Luria Anne O'Donnell, Mosher Stephen, Morgan Martin, Nekrutenko Anton, O'Connor Brian D, Osborn Kevin, Paten Benedict, Patterson Candace, Tan Frederick J, Taylor Casey Overby, Vessio Jennifer, Waldron Levi, Wang Ting, Wuichet Kristin

Affiliations

Department of Biology, Johns Hopkins University, Baltimore, MD, USA.

Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

Publication information

Cell Genom. 2022 Jan 12;2(1). doi: 10.1016/j.xgen.2021.100085. Epub 2022 Jan 13.

DOI: 10.1016/j.xgen.2021.100085
PMID: 35199087
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8863334/
Abstract

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad4/9903656/d16ba02594bc/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad4/9903656/7622641d7af1/fx1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ad4/9903656/16fcc05180fb/gr1.jpg
