Computational and Integrative Biomedical Research Center, Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, TX, USA.
Department of Statistics and Electrical & Computer Engineering, Rice University, Houston, TX, USA and Department of Pediatrics-Neurology, Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Baylor College of Medicine, Houston, TX, USA.
Bioinformatics. 2016 Mar 15;32(6):952-4. doi: 10.1093/bioinformatics/btv677. Epub 2015 Nov 14.
The Cancer Genome Atlas (TCGA) has made massive amounts of high-throughput genomics data profiled from tumor samples publicly available.
We have developed an open-source software package, TCGA2STAT, that obtains TCGA data, wrangles it, and pre-processes it into a format ready for multivariate and integrated statistical analysis in the R environment. With a single, user-friendly function call, the package downloads and fully processes the desired TCGA data so that it can be integrated seamlessly into a computational analysis pipeline. No further technical or biological knowledge is needed to use the software, which makes TCGA data easily accessible to data scientists without specific domain knowledge.
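As an illustration of the single-function-call workflow, a minimal R sketch might look like the following. It assumes the TCGA2STAT package is installed and that the remote data source it downloads from is reachable. The disease code ("OV") and data type ("RNASeq2") are illustrative choices, and the element name and layout of the returned object are stated from the package documentation as best recalled, so they should be checked against the installed version.

```r
# Sketch only: assumes TCGA2STAT is installed and the remote TCGA data
# source used by the package is reachable from this machine.
library(TCGA2STAT)

# One call downloads and processes the requested data.
# "OV" (ovarian cancer) and "RNASeq2" are illustrative arguments.
rnaseq.ov <- getTCGA(disease = "OV", data.type = "RNASeq2", clinical = TRUE)

# Inspect the top-level structure of the returned list.
str(rnaseq.ov, max.level = 1)

# Assumption: the expression matrix is stored in `dat`,
# with genes in rows and samples in columns.
dim(rnaseq.ov$dat)
```

The returned object can then be passed to standard R analysis functions without further parsing of raw TCGA files.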
TCGA2STAT is available from CRAN at https://cran.r-project.org/web/packages/TCGA2STAT/index.html
Supplementary data are available at Bioinformatics online.