Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China.
College of Mathematics and System Sciences, Xinjiang University, Urumqi, China.
BMC Bioinformatics. 2021 Sep 24;22(Suppl 3):457. doi: 10.1186/s12859-021-04377-0.
As one of the deadliest diseases in the world, cancer is driven by a few somatic mutations that disrupt the normal growth of cells and lead to abnormal proliferation and tumor development. The vast majority of somatic mutations do not affect the occurrence and development of cancer; thus, identifying the mutations responsible for tumor occurrence and development is one of the main goals of current cancer treatment.
To effectively identify driver genes, we adopted a semi-local centrality measure and a gene mutation effect function to assess the effect of gene mutations on changes in gene expression patterns. First, we calculated a mutation score for each gene. Second, we identified differentially expressed genes (DEGs) in the cohort by comparing the expression profiles of tumor samples and normal samples, and then constructed a local network for each mutant gene from the DEGs and mutant genes according to the protein-protein interaction network. Finally, we calculated the score of each mutant gene according to the objective function. The top-ranking mutant genes were selected as driver genes. We call the proposed method mutations effect and network centrality.
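The pipeline above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the objective function, so the sketch assumes the final score is the product of a precomputed mutation score and the semi-local centrality of the gene. It also uses a single subnetwork restricted to DEGs and mutant genes instead of one local network per mutant gene. The names `build_graph`, `semi_local_centrality`, and `rank_candidates` are hypothetical, and the centrality follows the commonly used two-hop definition (N counts neighbors within two hops, Q sums N over neighbors, and the centrality sums Q over neighbors).

```python
def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def two_hop_count(adj, node):
    """N(node): number of distinct nodes within two hops, excluding node itself."""
    reached = set(adj[node])
    for neighbor in adj[node]:
        reached |= adj[neighbor]
    reached.discard(node)
    return len(reached)


def semi_local_centrality(adj):
    """CL(v) = sum over neighbors u of Q(u), where Q(u) = sum over neighbors w of N(w)."""
    n = {w: two_hop_count(adj, w) for w in adj}
    q = {u: sum(n[w] for w in adj[u]) for u in adj}
    return {v: sum(q[u] for u in adj[v]) for v in adj}


def rank_candidates(ppi_edges, mutation_score, degs):
    """Rank mutant genes by mutation_score * centrality (an assumed objective).

    ppi_edges      -- iterable of (gene, gene) protein-protein interactions
    mutation_score -- dict mapping each mutant gene to its precomputed score
    degs           -- set of differentially expressed genes
    """
    # Simplification: keep only interactions among DEGs and mutant genes.
    keep = set(degs) | set(mutation_score)
    sub_edges = [(u, v) for u, v in ppi_edges if u in keep and v in keep]
    centrality = semi_local_centrality(build_graph(sub_edges))
    scores = {g: m * centrality.get(g, 0) for g, m in mutation_score.items()}
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    # Toy data, purely illustrative.
    edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "x")]
    print(rank_candidates(edges, {"a": 2.0, "b": 1.0}, {"c", "d"}))
```

In this toy example the gene `x` is neither mutated nor differentially expressed, so its interaction is dropped before the centrality is computed. A real analysis would replace the product with the objective function defined in the paper.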
We tested the method on data for four cancer types from The Cancer Genome Atlas. The experimental results showed that our method outperformed the existing network-centric method, as it was able to quickly and easily identify driver genes and rare driver factors.