Department of Computational Biomedicine, Vingroup Big Data Institute, Hanoi, Vietnam.
Faculty of Pharmacy, Dainam University, Hanoi, Vietnam.
Sci Rep. 2020 Nov 25;10(1):20521. doi: 10.1038/s41598-020-77318-1.
The accumulation of mutated genes is vital for the initiation and progression of cancer. However, research on driver gene discovery has selected and applied analysis tools and models unsystematically and in isolation. Previous studies may also have neglected low-frequency drivers and have seldom predicted the subgroup specificity of identified driver genes. In this study, we present an improved driver gene identification and analysis pipeline that comprises the four most widely used analyses for driver genes: enrichment analysis, association of clinical features with the expression profiles of identified driver genes, association of clinical features with their functional modules, and patient stratification, using existing advanced computational tools that integrate multi-omics data. We demonstrated the pipeline's general usability on breast cancer (BRCA) and validated the results against independent databases. Accordingly, 31 validated driver genes, including four novel ones, were discovered. Subsequently, we detected cancer-related, significantly enriched Gene Ontology terms and pathways; probable drug targets; two co-expressed modules significantly associated with several clinical features, such as the number of positive lymph nodes, the Nottingham prognostic index, and tumor stage; and two biologically distinct groups of BRCA patients. Data and source code of the case study can be downloaded from https://github.com/hauldhut/drivergene.
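To make the enrichment step concrete, the following is a minimal sketch of a one-sided hypergeometric (over-representation) test, a common way to ask whether a driver gene list overlaps an annotated gene set more than chance would predict. It is an illustration under stated assumptions, not the pipeline's own implementation: the function name `enrichment_p_value` and every count in the usage example except the 31 driver genes are hypothetical, and the abstract does not say which enrichment tool or statistic the authors used.

```python
from math import comb


def enrichment_p_value(n_background: int, n_term: int,
                       n_drivers: int, n_overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= n_overlap).

    n_background: genes in the background universe
    n_term:       background genes annotated to the term or pathway
    n_drivers:    genes in the driver list (drawn from the background)
    n_overlap:    driver genes that are annotated to the term
    """
    upper = min(n_term, n_drivers)
    total = comb(n_background, n_drivers)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_drivers - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: only the 31 driver genes come from the abstract.
p = enrichment_p_value(n_background=20000, n_term=200,
                       n_drivers=31, n_overlap=5)
print(f"enrichment p-value: {p:.3g}")
```

In practice such a test is run once per term or pathway, so the resulting p-values would need multiple-testing correction before any term is called significantly enriched.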