Sewak Mihir S, Reddy Narender P, Duan Zhong-Hui
Department of Biomedical Engineering, University of Akron, Akron, OH 44325-0302.
Bioinform Biol Insights. 2009 Sep 3;3:89-98. doi: 10.4137/bbi.s2908.
Analysis of gene expression data provides an objective and efficient technique for sub-classification of leukemia. The purpose of the present study was to design committee neural network based classification systems to subcategorize leukemia gene expression data. A binary classification system was designed to differentiate acute lymphoblastic leukemia from acute myeloid leukemia. A ternary classification system, which classifies leukemia expression data into three subclasses (B-cell acute lymphoblastic leukemia, T-cell acute lymphoblastic leukemia, and acute myeloid leukemia), was also developed. In each classification system, gene expression profiles of leukemia patients were first subjected to a sequence of simple preprocessing steps, which filtered out approximately 95 percent of the genes as non-informative. The remaining 5 percent, the informative genes, were used to train a set of artificial neural networks with different parameters and architectures. The networks that gave the best results during initial testing were recruited into a committee, and the committee decision was made by majority voting. The committee neural network system was then evaluated using data not used in training. The binary classification system classified microarray gene expression profiles into two categories with 100 percent accuracy, and the ternary system correctly predicted the three subclasses of leukemia in over 97 percent of the cases.
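The majority-voting step described in the abstract can be illustrated with a minimal sketch. It assumes each committee member has already produced a class label for a given expression profile. The function name `committee_vote`, the label strings, and the tie-breaking rule are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def committee_vote(predictions):
    """Return the label chosen by the most committee members.

    predictions: one class label per member network (hypothetical input
    format). Ties go to the label that appears first in the list; this
    tie rule is an assumption, since the abstract does not describe one.
    """
    if not predictions:
        raise ValueError("committee has no members")
    counts = Counter(predictions)
    # most_common keeps first-seen order among labels with equal counts
    return counts.most_common(1)[0][0]


# Hypothetical outputs of a three-member committee for one patient profile
member_outputs = ["ALL", "AML", "ALL"]
decision = committee_vote(member_outputs)
```

Using an odd number of members avoids ties in the binary case; in the ternary case a tie is still possible, so a real system would need an explicit rule for it.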