School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China.
Sci Rep. 2017 Aug 17;7(1):8584. doi: 10.1038/s41598-017-08881-3.
Diagnosing colorectal cancer at an early stage is an urgent need. Some feature genes that are important to colorectal cancer development have been identified. However, for early-stage colorectal cancer, less is known about the specific cancer genes that are associated with advanced clinical stage. In this paper, we propose a feature extraction method named the Optimal Mean based Block Robust Feature Extraction method (OMBRFE) to identify feature genes associated with advanced clinical-stage colorectal cancer from integrated colorectal cancer data. First, based on the optimal mean and the L2,1-norm, a novel feature extraction method called the Optimal Mean based Robust Feature Extraction method (OMRFE) is proposed to identify feature genes. Then the OMBRFE method, which introduces the block idea into OMRFE, is put forward to process the integrated colorectal cancer data, which comprise multiple types of genomic data: copy number alterations, somatic mutations, methylation alterations, and gene expression changes. Experimental results demonstrate that OMBRFE is more effective than previous methods in identifying feature genes. Moreover, the genes identified by OMBRFE are verified to be closely associated with advanced clinical-stage colorectal cancer.
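As a minimal illustration of the kind of row-wise norm commonly used in robust feature extraction, the sketch below computes an L2,1-norm (the sum of the Euclidean norms of a matrix's rows) and ranks rows by their norm. It is not the authors' OMRFE or OMBRFE implementation: that this is the norm intended here, the function names `l21_norm` and `rank_rows_by_norm`, the idea of scoring genes by row norm, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np


def l21_norm(M):
    """L2,1-norm: sum of the Euclidean (L2) norms of the rows of M."""
    M = np.asarray(M, dtype=float)
    return float(np.sum(np.linalg.norm(M, axis=1)))


def rank_rows_by_norm(W, top_k):
    """Indices of the top_k rows of W with the largest Euclidean norm.

    Hypothetical scoring rule: each row stands for one gene, and a larger
    row norm is taken to mean a more informative gene.
    """
    W = np.asarray(W, dtype=float)
    scores = np.linalg.norm(W, axis=1)
    # argsort is ascending, so reverse it to get the largest norms first.
    return np.argsort(scores)[::-1][:top_k]


# Toy matrix: 3 hypothetical genes (rows) by 2 components (columns).
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [5.0, 12.0]])

total = l21_norm(W)
top_genes = rank_rows_by_norm(W, top_k=2)
```

Because the norm is taken row by row before summing, a single row with large entries contributes linearly rather than quadratically, which is the usual motivation for preferring it over a squared Frobenius norm when the data contain outliers.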