Department of Electrical and Information Engineering, University of Nairobi, Kenya.
Biomed Res Int. 2021 Oct 20;2021:4784057. doi: 10.1155/2021/4784057. eCollection 2021.
Disease diagnosis faces challenges such as misdiagnosis, lack of diagnosis, and slow diagnosis. Several machine learning techniques have been applied to address these challenges: a set of symptoms is fed to a classification model that predicts the presence or absence of a disease. To improve on the performance of these techniques, this paper presents a technique that combines feature selection using principal component analysis (PCA), a hybrid kernel-based support vector machine (HKSVM) classification model, and hyperparameter optimization using a genetic algorithm (GA). The HKSVM introduces a new way of combining three kernels: radial basis function (RBF), linear, and polynomial. Combining a local kernel (RBF) with global kernels (linear and polynomial) improves model performance, because local kernels are better able to distinguish points that are close to each other, while global kernels are better suited to distinguishing points that are far apart. The PCA-GA-HKSVM is evaluated on seven medical datasets, two multiclass and five binary. The performance metrics used were accuracy, precision, and recall. The PCA-GA-HKSVM was observed to offer better performance than single-kernel support vector machines (SVMs).
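The abstract does not state how the three kernels are combined, so the following is only a minimal sketch of one common construction: a non-negative weighted sum of the RBF, linear, and polynomial kernels, which is itself a valid kernel. The function names, the weights, and the default values of `gamma`, `degree`, and `coef0` are illustrative assumptions, not the paper's formulation; in a GA-tuned setup they would be among the hyperparameters being searched.

```python
import math


def linear_kernel(x, y):
    """Global kernel: plain dot product."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Global kernel: (x . y + coef0) ** degree."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """Local kernel: decays quickly as the two points move apart."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def hybrid_kernel(x, y, weights=(0.5, 0.25, 0.25),
                  gamma=0.5, degree=2, coef0=1.0):
    """Weighted sum of RBF, linear, and polynomial kernels.

    ASSUMPTION: the weighted-sum form and all default values are
    hypothetical; the source abstract does not specify them.
    """
    if min(weights) < 0:
        # Non-negative weights keep the sum a valid (PSD) kernel.
        raise ValueError("kernel weights must be non-negative")
    w_rbf, w_lin, w_poly = weights
    return (w_rbf * rbf_kernel(x, y, gamma)
            + w_lin * linear_kernel(x, y)
            + w_poly * polynomial_kernel(x, y, degree, coef0))


def gram_matrix(points, **kernel_params):
    """Pairwise hybrid-kernel values, as a precomputed-kernel SVM would need."""
    return [[hybrid_kernel(p, q, **kernel_params) for q in points]
            for p in points]


if __name__ == "__main__":
    near = hybrid_kernel([0.0, 0.0], [0.1, 0.0])   # RBF term still contributes
    far = hybrid_kernel([0.0, 0.0], [10.0, 0.0])   # RBF term is almost zero
    print(near, far)
```

For two distant points the RBF term vanishes and only the global terms remain, which matches the local/global distinction drawn in the abstract. A matrix built with `gram_matrix` could be passed to an SVM implementation that accepts a precomputed kernel.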