Department of Computer Science, University of Craiova, Craiova 200585, Romania.
Royal Society of Medicine, United Kingdom.
J Biomed Inform. 2018 Jul;83:159-166. doi: 10.1016/j.jbi.2018.06.003. Epub 2018 Jun 15.
Methods based on microarrays (MA), mass spectrometry (MS), and machine learning (ML) algorithms have evolved rapidly in recent years, allowing for the early detection of several types of cancer. A pitfall of these approaches, however, is overfitting caused by a large number of attributes and a small number of instances, a phenomenon known as the 'curse of dimensionality'. A potentially fruitful way to avoid this drawback is to develop algorithms that combine fast computation with a filtering module for the attributes. The goal of this paper is to propose a statistical strategy for initializing the hidden nodes of a single-hidden layer feedforward neural network (SLFN) using both the knowledge embedded in the data and a filtering mechanism for attribute relevance. To assess its feasibility, the proposed model was tested on five publicly available high-dimensional datasets covering breast, lung, colon, and ovarian cancer, with gene expression and proteomic spectra provided by cDNA arrays, DNA microarrays, and MS. The novel algorithm, called adaptive SLFN (aSLFN), was compared with four major classification algorithms: the traditional extreme learning machine (ELM), the radial basis function network (RBF), a single-hidden layer feedforward neural network trained by the backpropagation algorithm (BP-SLFN), and the support vector machine (SVM). Experimental results showed that the classification performance of aSLFN is competitive with that of the comparison models.
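The abstract does not specify how aSLFN initializes its hidden nodes or how its filtering mechanism scores attributes, so the code below should not be read as the authors' method. It is only a minimal sketch of the baseline the abstract compares against, a traditional ELM (random hidden weights, output weights solved by least squares), preceded by a simple attribute filter. The filter (absolute Pearson correlation between each attribute and the class label), the synthetic data, and all function names (`select_attributes`, `train_elm`, `predict_elm`, `make_toy_data`) are assumptions introduced purely for illustration.

```python
import numpy as np


def select_attributes(X, y, keep):
    """Return indices of the `keep` attributes most correlated with y.

    Assumed stand-in filter: absolute Pearson correlation with the label.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    score = np.abs(Xc.T @ yc) / denom
    return np.argsort(score)[::-1][:keep]


def _hidden(X, W, b):
    # Sigmoid activations of the single hidden layer.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def train_elm(X, y, n_hidden, rng):
    """Traditional ELM: random hidden weights, least-squares output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    beta = np.linalg.pinv(_hidden(X, W, b)) @ y
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Threshold the network output at 0.5 to obtain 0/1 class labels."""
    return (_hidden(X, W, b) @ beta >= 0.5).astype(int)


def make_toy_data(n, d, informative, rng):
    """Synthetic 'many attributes, few instances' data (illustrative only)."""
    y = np.arange(n) % 2                      # balanced binary labels
    X = rng.normal(size=(n, d))
    X[:, :informative] += 3.0 * y[:, None]    # only the first columns carry signal
    return X, y


rng = np.random.default_rng(0)
X, y = make_toy_data(n=60, d=500, informative=5, rng=rng)
idx = select_attributes(X, y, keep=10)        # filter before training
W, b, beta = train_elm(X[:, idx], y, n_hidden=20, rng=rng)
pred = predict_elm(X[:, idx], W, b, beta)
print("selected attributes:", sorted(idx.tolist()))
print("training accuracy:", (pred == y).mean())
```

Filtering before training keeps the hidden layer small relative to the number of instances, which is the overfitting concern the abstract raises. On real data the filter would need to be fitted on training folds only, otherwise the reported accuracy would be optimistic.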