Department of Science, University of Sannio, 82100 Benevento, Italy.
Bioinformatics. 2012 Oct 1;28(19):2512-4. doi: 10.1093/bioinformatics/bts453. Epub 2012 Jul 18.
SUMMARY: Identifying genetic alterations in tumor cells has become a common method for detecting the genes involved in the development and progression of cancer. To detect driver genes, several samples need to be analyzed simultaneously. The Cancer Genome Atlas (TCGA) project provides access to a large amount of data for several cancer types. TCGA is an invaluable source of information, but analysis of this huge dataset poses important computational problems in terms of memory and execution time. Here, we present an R package, called VegaMC (Vega multi-channel), that enables fast and efficient detection of significant recurrent copy number alterations in very large datasets. VegaMC is integrated with the output of the common tools that convert allele signal intensities into log R ratio and B allele frequency. It also enables the detection of loss of heterozygosity and outputs two web pages that allow rapid and easy navigation of the aberrant genes. Synthetic data and real datasets are used for quantitative and qualitative evaluation. In particular, we demonstrate the capabilities of VegaMC on two large TCGA datasets: colon adenocarcinoma and glioblastoma multiforme. For both datasets, we provide the list of aberrant genes, which contains previously validated genes and can be used as a basis for further investigations. AVAILABILITY: VegaMC is an R/Bioconductor package, available at http://bioconductor.org/packages/release/bioc/html/VegaMC.html. CONTACT: morganella@unisannio.it SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.