Gene expression based leukemia sub-classification using committee neural networks.

Authors

Sewak Mihir S, Reddy Narender P, Duan Zhong-Hui

Affiliation

Department of Biomedical Engineering, University of Akron, Akron, OH 44325-0302.

Publication

Bioinform Biol Insights. 2009 Sep 3;3:89-98. doi: 10.4137/bbi.s2908.

Abstract

Analysis of gene expression data provides an objective and efficient technique for sub-classification of leukemia. The purpose of the present study was to design committee neural network based classification systems to subcategorize leukemia gene expression data. In the study, a binary classification system was considered to differentiate acute lymphoblastic leukemia from acute myeloid leukemia. A ternary classification system, which classifies leukemia expression data into three subclasses (B-cell acute lymphoblastic leukemia, T-cell acute lymphoblastic leukemia, and acute myeloid leukemia), was also developed. In each classification system, gene expression profiles of leukemia patients were first subjected to a sequence of simple preprocessing steps, which filtered out approximately 95 percent of the genes as non-informative. The remaining 5 percent, the informative genes, were used to train a set of artificial neural networks with different parameters and architectures. The networks that gave the best results during initial testing were recruited into a committee, whose decision was made by majority voting. The committee neural network system was then evaluated using data not used in training. The binary classification system classified microarray gene expression profiles into two categories with 100 percent accuracy, and the ternary system correctly predicted the three subclasses of leukemia in over 97 percent of the cases.
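The majority-voting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `committee_vote`, the class labels, the three-member committee, and the choice to return `None` on a tie are all assumptions made for illustration, since the abstract does not state the committee size or how ties are resolved.

```python
from collections import Counter


def committee_vote(member_predictions):
    """Combine per-member class predictions by majority vote.

    member_predictions: list of lists, one inner list per committee member,
    each holding that member's predicted label for every sample.
    Returns one label per sample, or None where the top vote is tied
    (tie handling is an assumption, not taken from the paper).
    """
    if not member_predictions:
        raise ValueError("committee has no members")
    n_samples = len(member_predictions[0])
    if any(len(p) != n_samples for p in member_predictions):
        raise ValueError("all members must predict the same number of samples")

    decisions = []
    # zip(*...) regroups the predictions so each tuple holds all votes for one sample
    for votes in zip(*member_predictions):
        counts = Counter(votes).most_common()
        top_count = counts[0][1]
        winners = [label for label, count in counts if count == top_count]
        decisions.append(winners[0] if len(winners) == 1 else None)
    return decisions


# Hypothetical predictions from three networks on four samples (ternary case)
members = [
    ["B-ALL", "AML", "T-ALL", "AML"],
    ["B-ALL", "AML", "B-ALL", "T-ALL"],
    ["T-ALL", "AML", "T-ALL", "B-ALL"],
]
print(committee_vote(members))
```

In the binary case, a committee with an odd number of members cannot tie, so the `None` branch only matters for the ternary system or for even-sized committees.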

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3da3/2808175/ff32b0ca38c3/bbi-2009-089f1.jpg
