Department of Biostatistics and Epidemiology, Memorial Sloan Kettering Cancer Center, New York, NY, United States of America.
Department of Cell Biology, Harvard Medical School, Boston, MA, United States of America.
PLoS One. 2020 Nov 2;15(11):e0234669. doi: 10.1371/journal.pone.0234669. eCollection 2020.
Large-scale sequencing projects, such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), have generated high-throughput sequencing and molecular profiling data sets, but it remains challenging to identify, in an automated fashion, potentially causal changes in cellular processes in cancer and in other diseases. We developed netboxr, a package written in the R programming language that uses the NetBox algorithm to identify candidate cancer-related functional modules. The algorithm takes a data-driven, network-based approach that combines prior knowledge with a network clustering algorithm, which removes the need for, and the limitations of, independently curated, functionally labeled gene sets. The method can combine multiple data types, such as mutations and copy number alterations, leading to more reliable identification of functional modules. We make the tool available in the Bioconductor R ecosystem for applications in cancer research and cell biology.
The netboxr package is a free, open-source R package released under the GNU GPL-3 license and available at https://www.bioconductor.org/packages/release/bioc/html/netboxr.html.
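The network-based idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the netboxr API and not the authors' implementation. It is a simplified Python illustration under stated assumptions: the network, the gene labels (`A` to `F`, `L1`, `L2`, `bg0` to `bg19`), and the helper names `build_graph`, `find_linkers`, and `find_modules` are all hypothetical. Altered genes from two data types are pooled, unaltered "linker" genes are kept when they touch more altered genes than expected by chance (an uncorrected hypergeometric test is assumed here), and connected components stand in for the network clustering step.

```python
from collections import defaultdict
from math import comb


def build_graph(edges):
    """Build an undirected adjacency map from (gene, gene) pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def upper_tail_hypergeom(k, N, K, n):
    """P(X >= k) when n neighbors are drawn from N genes, K of them altered."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def find_linkers(graph, altered, p_cutoff=0.05):
    """Keep unaltered genes adjacent to >= 2 altered genes with p < p_cutoff.

    Assumption: no multiple-testing correction is applied in this sketch.
    """
    linkers = {}
    N = len(graph) - 1  # every gene except the candidate itself
    K = len(altered)
    for gene, neighbors in graph.items():
        if gene in altered:
            continue
        k = len(neighbors & altered)
        if k < 2:
            continue
        p = upper_tail_hypergeom(k, N, K, len(neighbors))
        if p < p_cutoff:
            linkers[gene] = p
    return linkers


def find_modules(graph, members):
    """Connected components of the subnetwork induced by `members`.

    Stand-in for a network clustering algorithm; singletons are dropped.
    """
    seen, modules = set(), []
    for start in sorted(members):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend((graph[gene] & members) - component)
        seen |= component
        if len(component) > 1:
            modules.append(sorted(component))
    return modules


# Hypothetical toy network: two candidate linkers plus a ring of background genes.
edges = [("A", "L1"), ("B", "L1"), ("C", "L1"),
         ("D", "E"), ("D", "L2"), ("E", "L2"), ("L2", "bg0"),
         ("F", "bg5")]
edges += [(f"bg{i}", f"bg{(i + 1) % 20}") for i in range(20)]

# Two data types are combined by taking the union of altered genes.
mutated = {"A", "B", "D"}
copy_number_altered = {"C", "E", "F"}
altered = mutated | copy_number_altered

graph = build_graph(edges)
linkers = find_linkers(graph, altered)
modules = find_modules(graph, altered | set(linkers))

print("linkers:", linkers)
print("modules:", modules)
```

In this toy example, `L1` touches three altered genes and is kept as a linker, while `L2` touches only two of its three neighbors and does not pass the cutoff. The resulting modules are `A, B, C, L1` and `D, E`; the isolated altered gene `F` is dropped. A real analysis would use a curated interaction network, a multiple-testing correction, and a proper community detection method in place of connected components.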