Paul Topon Kumar, Iba Hitoshi
Department of Frontier Informatics, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8561, Japan.
Biosystems. 2005 Dec;82(3):208-25. doi: 10.1016/j.biosystems.2005.07.003. Epub 2005 Aug 22.
DNA microarray-based gene expression profiles have recently been used to correlate the clinical behavior of cancers with differential gene expression levels in cancerous and normal tissues. To this end, a set of predictive genes is typically selected by signal-to-noise (S2N) ratio, after which unsupervised methods such as clustering and supervised methods such as the k-nearest neighbor (kNN) classifier are widely used. Instead of the S2N ratio, adaptive search methods such as the Probabilistic Model Building Genetic Algorithm (PMBGA) can be applied to select a smaller gene subset that classifies patient samples more accurately. In this paper, we propose a new PMBGA-based method for identifying informative genes from microarray data. Applying the proposed method to the classification of three microarray data sets of binary and multi-type tumors, we demonstrate that the gene subsets selected with our technique yield better classification accuracy.
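For readers unfamiliar with the S2N baseline mentioned above, the following is a minimal sketch of S2N-based gene ranking for a two-class problem. The abstract does not define the ratio, so the sketch assumes the commonly used form (mean of class 0 minus mean of class 1, divided by the sum of the two class standard deviations). The function names `s2n_scores` and `top_genes` and the toy data are hypothetical, and this illustrates only the baseline filter, not the PMBGA-based method proposed in the paper.

```python
from statistics import mean, stdev


def s2n_scores(expr, labels):
    """Return one S2N score per gene.

    expr   : list of samples, each a list of expression values (one per gene)
    labels : list of 0/1 class labels, one per sample
    Assumed definition: (mean_0 - mean_1) / (std_0 + std_1).
    """
    n_genes = len(expr[0])
    scores = []
    for g in range(n_genes):
        class0 = [row[g] for row, y in zip(expr, labels) if y == 0]
        class1 = [row[g] for row, y in zip(expr, labels) if y == 1]
        denom = stdev(class0) + stdev(class1)
        # Guard against genes that are constant within both classes.
        scores.append((mean(class0) - mean(class1)) / denom if denom > 0 else 0.0)
    return scores


def top_genes(expr, labels, k):
    """Indices of the k genes with the largest absolute S2N score."""
    scores = s2n_scores(expr, labels)
    ranked = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return ranked[:k]


# Toy data (illustrative only): 4 samples x 3 genes, two samples per class.
expr = [
    [1.0, 5.0, 3.0],
    [2.0, 5.5, 9.0],
    [9.0, 5.2, 4.0],
    [10.0, 5.1, 8.0],
]
labels = [0, 0, 1, 1]

selected = top_genes(expr, labels, 2)  # gene 0 separates the classes best
```

A filter of this kind scores each gene independently, which is one reason a subset search such as a PMBGA may find smaller gene sets that classify better: it can evaluate genes in combination rather than one at a time.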