Research Scholar, Ramanujan Computing Centre, College of Engineering Guindy, Anna University, Chennai 600025, Tamil Nadu, India.
Professor, Ramanujan Computing Centre, College of Engineering Guindy, Anna University, Chennai 600025, Tamil Nadu, India.
Comput Math Methods Med. 2019 Sep 23;2019:7398307. doi: 10.1155/2019/7398307. eCollection 2019.
A framework for clinical diagnosis has been designed and implemented that uses bioinspired algorithms for feature selection and a gradient descent backpropagation neural network for classification. The clinical data undergo preprocessing, feature selection, and classification. Hot deck imputation is used to handle missing values, and min-max normalization is used for data transformation. Feature selection follows a wrapper approach that employs three bioinspired algorithms, namely Differential Evolution, Lion Optimization, and Glowworm Swarm Optimization, with the accuracy of an AdaBoostSVM classifier as the fitness function. Each bioinspired algorithm selects a subset of features, yielding three feature subsets. Correlation-based ensemble feature selection is then performed to select the optimal features from these three subsets, and the selected features are used to train the gradient descent backpropagation neural network. Ten-fold cross-validation is used to train and test the classifier. The Hepatitis dataset and the Wisconsin Diagnostic Breast Cancer (WDBC) dataset from the University of California Irvine (UCI) Machine Learning Repository are used to evaluate classification accuracy. An accuracy of 98.47% is obtained for the WDBC dataset and 95.51% for the Hepatitis dataset. The proposed framework can be tailored to develop clinical decision-making systems for any health disorder to assist physicians in clinical diagnosis.
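The preprocessing stage named in the abstract (hot deck imputation followed by min-max normalization) can be illustrated with a minimal sketch. The abstract does not say which hot deck variant is used, so the random-donor variant below is an assumption, as are the function names `hot_deck_impute` and `min_max_normalize`, the fixed seed, and the toy records; none of these come from the paper.

```python
import random
from typing import List, Optional

Row = List[Optional[float]]


def hot_deck_impute(rows: List[Row], seed: int = 0) -> List[List[float]]:
    """Replace each missing value (None) with a value drawn at random
    from the observed values of the same feature (random hot deck).
    The donor-selection rule is an assumption, not the paper's method."""
    rng = random.Random(seed)
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        donors = [r[j] for r in rows if r[j] is not None]
        if not donors:
            raise ValueError(f"feature {j} has no observed values")
        for r in filled:
            if r[j] is None:
                r[j] = rng.choice(donors)
    return filled


def min_max_normalize(rows: List[List[float]]) -> List[List[float]]:
    """Rescale each feature to [0, 1] via (x - min) / (max - min).
    A constant feature is mapped to 0.0 to avoid division by zero."""
    n_cols = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_cols)]
    highs = [max(r[j] for r in rows) for j in range(n_cols)]
    return [
        [
            (r[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_cols)
        ]
        for r in rows
    ]


# Toy records with two features; None marks a missing value (hypothetical data).
records = [[1.0, 20.0], [None, 40.0], [3.0, None], [5.0, 60.0]]
complete = hot_deck_impute(records)
scaled = min_max_normalize(complete)
```

Because donors are drawn from the observed values of the same feature, imputation cannot widen a feature's range, so the minimum and maximum used for scaling are those of the observed data.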