Tung W L, Quek C
Centre for Computational Intelligence, School of Computer Engineering, Nanyang Technological University, Blk N4 #2A-32, Nanyang Avenue, Singapore 639798, Singapore.
Artif Intell Med. 2005 Jan;33(1):61-88. doi: 10.1016/j.artmed.2004.03.009.
Acute lymphoblastic leukemia (ALL) is the most common malignancy of childhood, representing nearly one third of all pediatric cancers. Currently, the treatment of pediatric ALL centers on tailoring the intensity of therapy to a patient's risk of relapse, which is linked to the type of leukemia the patient has. Hence, accurate diagnosis of the various leukemia subtypes is an important first step in the treatment process. Recently, gene expression profiling using DNA microarrays has been shown to be a viable and accurate diagnostic tool for identifying the known prognostically important ALL subtypes. Thus, there is currently great interest in developing autonomous classification systems for cancer diagnosis using gene expression data, both to achieve an unbiased analysis of the data and, in part, to handle the large amount of genetic information extracted from DNA microarrays.
Generally, existing medical decision support systems (DSS) for cancer classification and diagnosis are based on traditional statistical methods such as Bayesian decision theory and on machine learning models such as neural networks (NN) and support vector machines (SVM). Though high accuracies have been reported for these systems, they fall short in certain critical areas. These include (a) presenting the extracted knowledge and explaining the computed solutions to users; (b) following a logical deduction process that is similar to, and as intuitive as, the human reasoning process; and (c) being flexible enough to incorporate new knowledge without the risk of eroding old but valid information. A neural fuzzy system, by contrast, is synthesized to emulate the human ability to learn and reason in the presence of imprecise and incomplete information, and so has the ability to overcome these shortcomings. However, existing neural fuzzy systems have their own limitations when used in the design and implementation of DSS. Hence, this paper proposes the use of a novel neural fuzzy system, the generic self-organising fuzzy neural network (GenSoFNN) with truth-value restriction (TVR) fuzzy inference, as a fuzzy DSS (denoted GenSo-FDSS) for the classification of ALL subtypes using gene expression data.
The performance of the GenSo-FDSS system is encouraging when benchmarked against that of the NN, SVM and K-nearest neighbor (K-NN) classifiers. On average, the GenSo-FDSS system achieved a classification rate above 90%.
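To make the benchmark terms concrete, the following is a minimal sketch of a K-NN baseline and of the classification rate (the fraction of test samples assigned the correct subtype). It is not the paper's implementation or data: the function names, the Euclidean distance, the value of k, the subtype labels and the three-gene expression profiles are all illustrative assumptions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query profile by majority vote among its k nearest training profiles."""
    # Euclidean distance is an assumption; the abstract does not name a metric.
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def classification_rate(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test profiles whose predicted subtype equals the true subtype."""
    correct = sum(
        knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y)
    )
    return correct / len(test_y)


# Made-up expression levels for three hypothetical genes; not data from the paper.
train_x = [(0.9, 0.1, 0.2), (0.8, 0.2, 0.1), (0.85, 0.15, 0.3),
           (0.1, 0.9, 0.7), (0.2, 0.8, 0.9), (0.15, 0.95, 0.8)]
train_y = ["subtype_A", "subtype_A", "subtype_A",
           "subtype_B", "subtype_B", "subtype_B"]
test_x = [(0.88, 0.12, 0.25), (0.12, 0.85, 0.75)]
test_y = ["subtype_A", "subtype_B"]

rate = classification_rate(train_x, train_y, test_x, test_y, k=3)
print(f"classification rate: {rate:.0%}")
```

Real microarray data has thousands of genes per sample, so a practical baseline would normally add gene selection and cross-validation; both are omitted here for brevity.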