Global Alliance for Genomics and Health Meets Bioconductor: Toward Reproducible and Agile Cancer Genomics at Cloud Scale.

Affiliations

Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA.

Graduate School of Public Health and Health Policy, City University of New York, New York, NY.

Publication information

JCO Clin Cancer Inform. 2020 May;4:472-479. doi: 10.1200/CCI.19.00111.

DOI:10.1200/CCI.19.00111
PMID:32453635
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7265787/
Abstract

PURPOSE

Institutional efforts toward the democratization of cloud-scale data and analysis methods for cancer genomics are proceeding rapidly. As part of this effort, we bridge two major bioinformatic initiatives: the Global Alliance for Genomics and Health (GA4GH) and Bioconductor.

METHODS

We describe in detail a use case in pancancer transcriptomics conducted by blending implementations of the GA4GH Workflow Execution Services and Tool Registry Service concepts with the Bioconductor curatedTCGAData and BiocOncoTK packages.
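
As a minimal sketch of what the Tool Registry Service (TRS) concept looks like in practice, the Python snippet below builds the URL from which a registered workflow's descriptor could be requested. It is an illustration under stated assumptions, not the authors' code: the base URL is assumed to be Dockstore's public TRS v2 endpoint, the path layout follows the GA4GH TRS v2 convention as best understood here, and the workflow id and version are hypothetical placeholders.

```python
from urllib.parse import quote

# Assumption: Dockstore's public TRS v2 base URL; verify against current Dockstore docs.
TRS_BASE = "https://dockstore.org/api/ga4gh/trs/v2"


def trs_descriptor_url(tool_id: str, version_id: str,
                       descriptor_type: str = "WDL",
                       base: str = TRS_BASE) -> str:
    """Build a TRS v2 URL for the primary descriptor of one tool version."""
    # Tool ids contain '#' and '/', so each id is percent-encoded
    # to keep it a single path segment.
    encoded_id = quote(tool_id, safe="")
    encoded_version = quote(version_id, safe="")
    return (f"{base}/tools/{encoded_id}"
            f"/versions/{encoded_version}/{descriptor_type}/descriptor")


# Hypothetical workflow id and version, for illustration only.
example_url = trs_descriptor_url(
    "#workflow/github.com/example-org/example-workflow", "v1.0")
print(example_url)
```

A client would issue an HTTP GET against the resulting URL to retrieve the workflow descriptor, which a Workflow Execution Service could then run; the network call is left out here so the sketch stays self-contained.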

RESULTS

We carried out the analysis with a formally archived workflow and container at dockstore.org and a workspace and notebook at app.terra.bio. The analysis identified relationships between microsatellite instability and biomarkers of immune dysregulation at a finer level of granularity than previously reported. Our use of standard approaches to containerization and workflow programming allows this analysis to be replicated and extended.

CONCLUSION

Experimental use of dockstore.org and app.terra.bio in concert with Bioconductor enabled novel statistical analysis of large genomic projects without the need for local supercomputing resources but involved challenges related to container design, script archiving, and unit testing. Best practices and cost/benefit metrics for the management and analysis of globally federated genomic data and annotation are evolving. The creation and execution of use cases like the one reported here will be helpful in the development and comparison of approaches to federated data/analysis systems in cancer genomics.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/509d/7265787/56af5709daaa/CCI.19.00111f1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/509d/7265787/49b1fcb074b1/CCI.19.00111f2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/509d/7265787/989e7251e8ae/CCI.19.00111f3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/509d/7265787/aa23f3383194/CCI.19.00111f4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/509d/7265787/cfc742299920/CCI.19.00111f5.jpg
