Center for Biomedical Informatics and Information Technology, NCI, Rockville, Maryland.
Velsera (Seven Bridges), Charlestown, Massachusetts.
Cancer Res. 2024 May 2;84(9):1404-1409. doi: 10.1158/0008-5472.CAN-23-2730.
More than ever, scientific progress in cancer research hinges on our ability to combine datasets and extract meaningful interpretations to better understand diseases and ultimately inform the development of better treatments and diagnostic tools. To enable the successful sharing and use of big data, the NCI developed the Cancer Research Data Commons (CRDC), providing access to a large, comprehensive, and expanding collection of cancer data. The CRDC is a cloud-based data science infrastructure that eliminates the need for researchers to download and store large-scale datasets by allowing them to perform analysis where data reside. Over the past 10 years, the CRDC has made significant progress in providing access to data and tools along with training and outreach to support the cancer research community. In this review, we provide an overview of the history and the impact of the CRDC to date, lessons learned, and future plans to further promote data sharing, accessibility, interoperability, and reuse. See related articles by Brady et al., p. 1384, Wang et al., p. 1388, and Pot et al., p. 1396.