Barakat Nahla H, Bradley Andrew P, Barakat Mohamed Nabil H
Department of Applied Information Technology, German University of Technology in Oman, Muscat 130, Oman.
IEEE Trans Inf Technol Biomed. 2010 Jul;14(4):1114-20. doi: 10.1109/TITB.2009.2039485. Epub 2010 Jan 12.
Diabetes mellitus is a chronic disease and a major public health challenge worldwide. According to the International Diabetes Federation, there are currently 246 million diabetic people worldwide, and this number is expected to rise to 380 million by 2025. Furthermore, 3.8 million deaths are attributable to diabetes complications each year. It has been shown that 80% of type 2 diabetes complications can be prevented or delayed by early identification of people at risk. In this context, several data mining and machine learning methods have been used for the diagnosis, prognosis, and management of diabetes. In this paper, we propose utilizing support vector machines (SVMs) for the diagnosis of diabetes. In particular, we use an additional explanation module, which turns the "black box" model of an SVM into an intelligible representation of the SVM's diagnostic (classification) decision. Results on a real-life diabetes dataset show that intelligible SVMs provide a promising tool for the prediction of diabetes: a comprehensible ruleset has been generated, with a prediction accuracy of 94%, sensitivity of 93%, and specificity of 94%. Furthermore, the extracted rules are medically sound and agree with the outcome of relevant medical studies.
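The three figures reported above (accuracy, sensitivity, specificity) are standard quantities derived from a binary confusion matrix. The following is a minimal sketch of how they are computed, not the authors' evaluation code; the function name `diagnostic_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp: diabetic cases correctly predicted as diabetic
    fp: non-diabetic cases wrongly predicted as diabetic
    tn: non-diabetic cases correctly predicted as non-diabetic
    fn: diabetic cases wrongly predicted as non-diabetic
    """
    positives = tp + fn  # all truly diabetic cases
    negatives = tn + fp  # all truly non-diabetic cases
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    accuracy = (tp + tn) / (positives + negatives)
    sensitivity = tp / positives  # true positive rate
    specificity = tn / negatives  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (NOT the paper's data).
acc, sens, spec = diagnostic_metrics(tp=90, fp=10, tn=190, fn=10)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters in a diagnostic setting because the two error types (a missed diabetic case versus a false alarm) carry different costs, and accuracy alone can hide an imbalance between them.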