Charles Sturt University, Australia.
Al Iraqia University, Baghdad, Iraq.
Health Informatics J. 2020 Mar;26(1):539-562. doi: 10.1177/1460458219839629. Epub 2019 Apr 11.
Classification for medical diagnosis is often critical because medical datasets are multilabel in nature; that is, a patient may have more than one health condition, such as high blood pressure, obesity, and diabetes. The aim of this article is to improve the accuracy and performance of multilabel classification using multilabel feature selection and an improved overlapping clustering method. The proposed system consists of an Optimized Initial Cluster Centers and Enhanced Objective Function technique. It reduces the number of iterations in the clustering process, which improves clustering performance, and it improves clustering accuracy; together these improve the accuracy and performance of multilabel classification. Ratios of clustering distance to class distance and execution time are used as the evaluation metric for accuracy, and total execution time is used as the evaluation metric for performance. Different values of accuracy and performance are obtained for different combinations of the number of labels, attributes, instances, and clusters. The results on all 10 datasets show that the proposed technique is superior to the current technique. On average, the proposed technique improves classification accuracy by 5%-7% and decreases processing time by 0.5-1 s. The proposed system, which consists of multilabel feature selection and an enhanced overlapping clustering technique, targets the accuracy and performance of multilabel classification for medical diagnosis. This study provides an acceptable range of accuracy with improved processing time, which assists doctors in diagnosing patients with conditions such as high blood pressure, obesity, and diabetes.
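The abstract does not give the details of the Optimized Initial Cluster Centers and Enhanced Objective Function technique, so the following is only a minimal sketch of the general idea of overlapping clustering with deterministic seeding, not the authors' method. It assumes a farthest-point rule as a stand-in for optimized initial centers and a simple distance-ratio threshold (the hypothetical `overlap` parameter) to let one instance belong to several clusters, as a patient may have several conditions. All function names are illustrative.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def farthest_point_init(points, k):
    """Assumed seeding rule: start at the point nearest the data mean,
    then repeatedly add the point farthest from all chosen centers."""
    centre = mean(points)
    centers = [min(points, key=lambda p: dist(p, centre))]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def assign(points, centers, overlap):
    """Each point joins its nearest cluster plus every cluster whose
    distance is within `overlap` times the nearest distance."""
    memberships = []
    for p in points:
        d = [dist(p, c) for c in centers]
        nearest = min(d)
        memberships.append(frozenset(j for j, dj in enumerate(d) if dj <= overlap * nearest))
    return memberships


def overlapping_kmeans(points, k, overlap=2.0, max_iter=100):
    """Alternate assignment and center updates until memberships stop
    changing; returns (centers, memberships, iterations used)."""
    centers = farthest_point_init(points, k)
    memberships = None
    for iteration in range(1, max_iter + 1):
        new_memberships = assign(points, centers, overlap)
        if new_memberships == memberships:
            return centers, memberships, iteration
        memberships = new_memberships
        for j in range(k):
            members = [p for p, m in zip(points, memberships) if j in m]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean(members)
    return centers, memberships, max_iter


# Toy data: two tight groups and one point lying between them.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
centers, memberships, iterations = overlapping_kmeans(points, k=2, overlap=2.0)
for p, m in zip(points, memberships):
    print(p, sorted(m))
print("iterations:", iterations)
```

In this toy run the in-between point (5, 5) ends up in both clusters, while the other points each belong to one. The iteration count returned by `overlapping_kmeans` is the kind of quantity a better seeding rule would be expected to reduce, which is the effect the abstract attributes to the optimized initial centers.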