Topşir Aysel, Güler Ferdi, Çetin Ecesu, Burak Mehmet Furkan, Ağraz Melih
Department of Industrial Engineering, Yıldız Technical University, Davutpaşa, İstanbul, 34220, Türkiye.
Department of Data Science and Analytics, Giresun University, Giresun, Türkiye.
BMC Med Inform Decis Mak. 2025 Jul 31;25(1):284. doi: 10.1186/s12911-025-03014-7.
Thyroid disease classification is a critical challenge in medical diagnostics, requiring accurate differentiation between hyperthyroidism, hypothyroidism, and normal thyroid function. This study introduces a machine learning approach that integrates generative adversarial networks (GANs) for data augmentation and Kolmogorov-Arnold networks (KANs) for classification. Several machine learning models, including logistic regression, random forest, support vector machines, multilayer perceptrons, and KANs, were trained and evaluated. The results indicate that GAN-based data augmentation significantly improved classification accuracy, particularly for minority classes. Specifically, the KAN model achieved an accuracy of 98.68% and random forest (RF) an F1-score of 98.00%, outperforming traditional neural network applications, and the KAN model showed superior performance and generalization capabilities compared to traditional neural networks. Additionally, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) were employed to ensure model transparency and interpretability. These explainability methods highlight thyroid-stimulating hormone (TSH) as the most prominent feature in classification, further supporting its clinical utility in the diagnosis of thyroid diseases. The findings underscore the potential of advanced AI-driven techniques in improving thyroid disease classification, addressing class imbalance, and enhancing explainability in healthcare applications. By leveraging synthetic data generation, this study provides a feasible framework for clinical application, particularly in situations where clinical data are limited or imbalanced. The integration of GANs and KANs enhances diagnostic accuracy while preserving robustness and generalizability to diverse patient populations. The approach also supports the deployment of explainable AI models in clinical decision support systems, so that healthcare practitioners can make better-informed and more reliable decisions, leading to better patient outcomes and resource allocation.
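The abstract reports both accuracy and F1-score and stresses minority classes. The following is a minimal sketch, not the authors' evaluation code, of why the two metrics can diverge on imbalanced data. The function names and the toy labels are assumptions made for illustration only; they are not drawn from the study's dataset or results.

```python
def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for each label in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so each class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical, deliberately imbalanced toy labels (NOT study data).
y_true = ["normal"] * 90 + ["hypothyroid"] * 6 + ["hyperthyroid"] * 4
# A degenerate classifier that predicts the majority class for everyone.
y_pred = ["normal"] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
print("per-class F1:", per_class_f1(y_true, y_pred))
```

On these toy labels the majority-only classifier reaches 90% accuracy while its macro F1 is roughly 0.32, because both minority classes score zero. This gap is the kind of effect that augmenting minority classes, as the abstract describes, is meant to reduce.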