Lu Xinguo, Deng Yong, Huang Lei, Feng Bingtao, Liao Bo
School of Information Science and Engineering, Hunan University, Changsha 410082, China; College of Mechatronics and Automation, National University of Defense Technology, Changsha 410073, China.
School of Information Science and Engineering, Hunan University, Changsha 410082, China.
J Theor Biol. 2014 Dec 7;362:75-82. doi: 10.1016/j.jtbi.2014.01.005. Epub 2014 Jan 15.
Gene expression profiles are used to classify patient samples for cancer diagnosis and therapy, and gene selection is crucial to high recognition performance. Conventional gene selection methods treat genes as independent individuals and make little use of the correlation among them. Here, a gene selection method for cancer recognition based on co-expression modules is proposed. First, a weighted correlation network is constructed from the cancer dataset according to the correlation between each pair of genes; modules are identified in this network, and the significant modules are selected for further analysis. Second, information gain is applied within these informative modules to select the feature genes for cancer recognition. Experiments are then conducted with different classification algorithms using leave-one-out cross-validation (LOOCV), and the results show that the proposed method achieves higher classification accuracy than traditional gene selection methods. Finally, gene ontology enrichment analysis is used to verify the biological significance of the co-expressed genes in specific modules.
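The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch on toy data. This is not the authors' implementation, and several parts are assumptions made for illustration: the soft-threshold power `beta=6` and the edge `cutoff`, connected components as a stand-in for module detection, the correlation between a module's mean expression profile and the class labels as a stand-in for module significance, binarizing each gene at its mean before computing information gain, and a nearest-centroid classifier. All function names and the toy matrix `GENES` are hypothetical.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def find_modules(genes, beta=6, cutoff=0.5):
    """Connected components of the graph whose edge weights |cor|**beta reach
    `cutoff`. Assumption: a simple stand-in for real module detection."""
    m = len(genes)
    adj = [[abs(pearson(genes[i], genes[j])) ** beta for j in range(m)] for i in range(m)]
    seen, modules = set(), []
    for start in range(m):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in range(m):
                if v not in seen and adj[u][v] >= cutoff:
                    seen.add(v)
                    stack.append(v)
        modules.append(sorted(component))
    return modules

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Information gain of the labels after splitting a gene at its mean (assumed discretization)."""
    threshold = sum(values) / len(values)
    low = [l for v, l in zip(values, labels) if v <= threshold]
    high = [l for v, l in zip(values, labels) if v > threshold]
    remainder = sum(len(p) / len(labels) * entropy(p) for p in (low, high) if p)
    return entropy(labels) - remainder

def select_genes(genes, labels, top_k=2, min_module_score=0.5):
    """Keep multi-gene modules whose mean profile tracks the labels (assumed
    significance score), then rank the genes of those modules by information gain."""
    candidates = []
    for module in find_modules(genes):
        profile = [sum(genes[g][s] for g in module) / len(module) for s in range(len(labels))]
        if len(module) > 1 and abs(pearson(profile, labels)) >= min_module_score:
            candidates.extend(module)
    ranked = sorted(candidates, key=lambda g: information_gain(genes[g], labels), reverse=True)
    return sorted(ranked[:top_k])

def loocv_accuracy(genes, labels):
    """Leave-one-out CV with a nearest-centroid classifier; genes are re-selected per fold."""
    n, correct = len(labels), 0
    for held_out in range(n):
        train = [s for s in range(n) if s != held_out]
        train_genes = [[g[s] for s in train] for g in genes]
        train_labels = [labels[s] for s in train]
        selected = select_genes(train_genes, train_labels)
        best_class, best_dist = None, float("inf")
        for cls in sorted(set(train_labels)):
            members = [i for i, l in enumerate(train_labels) if l == cls]
            dist = sum(
                (genes[g][held_out] - sum(train_genes[g][i] for i in members) / len(members)) ** 2
                for g in selected
            )
            if dist < best_dist:
                best_class, best_dist = cls, dist
        correct += best_class == labels[held_out]
    return correct / n

# Toy data: rows are genes, columns are samples; genes 0 and 1 are co-expressed
# and track the class label, genes 2 and 3 are co-expressed but uninformative.
GENES = [
    [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    [2.0, 2.3, 1.9, 6.1, 6.0, 5.8],
    [5.0, 1.0, 3.0, 1.2, 4.8, 3.1],
    [0.5, 2.5, 1.5, 2.4, 0.6, 1.4],
]
LABELS = [0, 0, 0, 1, 1, 1]

print(find_modules(GENES))
print(select_genes(GENES, LABELS))
print(loocv_accuracy(GENES, LABELS))
```

In this sketch the gene selection is repeated inside every LOOCV fold, so the held-out sample never influences which genes are chosen. The abstract does not state whether the original experiments were set up this way; it is a design choice of the sketch.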