Ramanujan Computing Centre, Anna University, Chennai 600025, India.
Department of Computer Science and Engineering, Anna University, Chennai 600025, India.
Comput Math Methods Med. 2021 May 17;2021:6662420. doi: 10.1155/2021/6662420. eCollection 2021.
A computer-aided diagnosis (CAD) system that employs a super learner to diagnose the presence or absence of a disease has been developed. Each clinical dataset is preprocessed and split into a training set (60%) and a testing set (40%). Feature selection uses a wrapper approach built on three bioinspired algorithms, namely, cat swarm optimization (CSO), krill herd (KH), and bacterial foraging optimization (BFO), with the classification accuracy of a support vector machine (SVM) as the fitness function. The features selected by each bioinspired algorithm are stored in three separate databases and are used to train three back propagation neural networks (BPNN) independently using the conjugate gradient algorithm (CGA). Each trained classifier is then tested on the testing set, and the diagnostic results obtained are used to evaluate its performance. The classification results that the three classifiers produce for each instance of the testing set, together with the class label associated with that instance, form the candidate instances for training and testing the super learner. For the super learner, the training set comprises 80% of these instances, and the testing set comprises the remaining 20%. Experimentation has been carried out using seven clinical datasets from the University of California Irvine (UCI) machine learning repository. The super learner has achieved a classification accuracy of 96.83% for the Wisconsin diagnostic breast cancer dataset (WDBC), 86.36% for the Statlog heart disease dataset (SHD), 94.74% for the hepatocellular carcinoma dataset (HCC), 90.48% for the hepatitis dataset (HD), 81.82% for the vertebral column dataset (VCD), 84% for the Cleveland heart disease dataset (CHD), and 70% for the Indian liver patient dataset (ILP).
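The construction of the super learner's data described above can be illustrated with a minimal Python sketch. It assumes binary class labels and that the three base classifiers' predictions on the testing set are already available as lists. The abstract does not state which model the super learner uses, so the lookup-table learner below, and every function name in it, is a hypothetical stand-in chosen only to keep the example self-contained.

```python
import random
from collections import Counter, defaultdict


def build_meta_instances(preds_a, preds_b, preds_c, labels):
    """Pair the three base predictions for each test instance with its class label."""
    return [((a, b, c), y) for a, b, c, y in zip(preds_a, preds_b, preds_c, labels)]


def split(instances, train_fraction=0.8, seed=0):
    """Shuffle a copy of the instances and split it (80%/20% by default)."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_table_learner(train):
    """Hypothetical meta-learner (an assumption, not the paper's model):
    remember the most frequent label for each pattern of base predictions."""
    votes = defaultdict(Counter)
    for pattern, label in train:
        votes[pattern][label] += 1
    return {pattern: counts.most_common(1)[0][0] for pattern, counts in votes.items()}


def predict(table, pattern):
    """Use the learned table; fall back to a majority vote for unseen patterns."""
    if pattern in table:
        return table[pattern]
    return Counter(pattern).most_common(1)[0][0]


def accuracy(table, instances):
    """Fraction of instances whose predicted label equals the true label."""
    correct = sum(predict(table, pattern) == label for pattern, label in instances)
    return correct / len(instances)
```

In use, the output of `build_meta_instances` would be passed to `split`, the first part to `fit_table_learner`, and the second part to `accuracy`. With three binary predictions the majority-vote fallback is always well defined, since no tie is possible.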