Samur Mehmet Kemal
Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard School of Public Health, Boston, Massachusetts, United States of America; Lebow Institute of Myeloma Therapeutics and Jerome Lipper Multiple Myeloma Center, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts, United States of America.
PLoS One. 2014 Sep 2;9(9):e106397. doi: 10.1371/journal.pone.0106397. eCollection 2014.
BACKGROUND & OBJECTIVE: Managing data from large-scale projects such as The Cancer Genome Atlas (TCGA) for further analysis is an important and time-consuming step in research projects. Several efforts, such as the Firehose project, make pre-processed TCGA data publicly available via web services and data portals, but these data must still be downloaded, managed, and prepared for subsequent steps. We have developed an open-source, extensible, R-based data client for pre-processed data from Firehose and demonstrate its use with sample case studies. Results show that our RTCGAToolbox can facilitate data management for researchers interested in working with TCGA data. The RTCGAToolbox can also be integrated with other analysis pipelines for further data processing.
The RTCGAToolbox is open source and licensed under the GNU General Public License Version 2.0. All documentation and source code for RTCGAToolbox are freely available at http://mksamur.github.io/RTCGAToolbox/ for Linux and Mac OS X operating systems.