A Comprehensive Infrastructure for Big Data in Cancer Research: Accelerating Cancer Research and Precision Medicine.

Authors

Hinkson Izumi V, Davidsen Tanja M, Klemm Juli D, Kerlavage Anthony R, Kibbe Warren A, Chandramouliswaran Ishwar

Affiliations

Center for Biomedical Informatics and Information Technology, National Cancer Institute, Rockville, MD, United States.

Science and Technology Policy Fellowship Program, American Association for the Advancement of Science, Washington, DC, United States.

Publication information

Front Cell Dev Biol. 2017 Sep 21;5:83. doi: 10.3389/fcell.2017.00083. eCollection 2017.

Abstract

Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors, and driving the evolution of precision medicine. Cancer is no longer considered a single disease, but rather, a diverse array of diseases wherein each patient has a unique collection of germline variants and somatic mutations. Molecular profiling of patient-derived samples has led to a data explosion that could help us understand the contributions of environment and germline to risk, therapeutic response, and outcome. To maximize the value of these data, an interdisciplinary approach is paramount. The National Cancer Institute (NCI) has initiated multiple projects to characterize tumor samples using multi-omic approaches. These projects harness the expertise of clinicians, biologists, computer scientists, and software engineers to investigate cancer biology and therapeutic response in multidisciplinary teams. Petabytes of cancer genomic, transcriptomic, epigenomic, proteomic, and imaging data have been generated by these projects. To address the data analysis challenges associated with these large datasets, the NCI has sponsored the development of the Genomic Data Commons (GDC) and three Cloud Resources. The GDC ensures data and metadata quality, ingests and harmonizes genomic data, and securely redistributes the data. During their pilot phase, the Cloud Resources tested multiple cloud-based approaches for enhancing data access, collaboration, computational scalability, resource democratization, and reproducibility. These NCI-led efforts are continuously being refined to better support open data practices and precision oncology, and to serve as building blocks of the NCI Cancer Research Data Commons.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/835a/5613113/fe2981e3ac2d/fcell-05-00083-g0001.jpg
