Muhsin Zahra J, Qahwaji Rami, Ghafir Ibrahim, AlShawabkeh Mo'ath, Al Bdour Muawyah, AlRyalat Saif Aldeen, Al-Taee Majid
Faculty of Engineering and Digital Technologies, University of Bradford, Bradford, BD7 1DP, UK.
Department of Ophthalmology, The Hashemite University, Zarqa, Jordan.
Eye Vis (Lond). 2025 Jun 24;12(1):25. doi: 10.1186/s40662-025-00440-6.
Despite extensive research on keratoconus (KC) detection with traditional machine learning models, stacking ensemble learning approaches remain underexplored. This paper presents a stacking ensemble learning method to enhance automated KC screening.
This study utilizes a clinical dataset containing detailed corneal data from 2491 cases classified as non-KC (NKC), subclinical KC (SCKC) and clinical KC (CKC). Each cornea is represented by 79 features extracted from Pentacam imaging. Following extensive pre-processing, key corneal features that are strongly correlated with the target diagnosis are identified. These features are the keratometry of the steepest anterior point, surface variance index, vertical asymmetry index, height decentration index, and height asymmetry index. A novel stacking ensemble model is developed using the selected features to improve corneal classification into NKC, SCKC, and CKC by integrating top tree-based classifiers (random forest, gradient boosting, decision trees) with a support vector machine meta-classifier.
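The architecture described above (tree-based base classifiers feeding a support vector machine meta-classifier) can be sketched with scikit-learn's StackingClassifier. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for the five selected Pentacam indices, and the hyperparameters, the 5-fold internal cross-validation, the RBF kernel, and the scaling step are all choices made here for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic 3-class data with 5 features, standing in for the
# five selected indices and the NKC / SCKC / CKC labels. Not clinical data.
X, y = make_classification(
    n_samples=600,
    n_features=5,
    n_informative=4,
    n_redundant=1,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Level 0: the three tree-based base classifiers named in the abstract.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),
    ("dt", DecisionTreeClassifier(random_state=0)),
]

# Level 1: an SVM meta-classifier trained on the base learners'
# out-of-fold class probabilities (scaling and kernel are assumptions).
meta_classifier = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

model = StackingClassifier(
    estimators=base_learners,
    final_estimator=meta_classifier,
    cv=5,  # folds used to produce the out-of-fold predictions
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

Training the meta-classifier on out-of-fold predictions, rather than on predictions for data the base learners have already seen, is what keeps the second level from simply memorizing overfitted base outputs.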
Pre-processing and feature selection reduced the model's inputs to five of the 79 original features (6.33%), which improved classification performance and cut training time by more than 85%. The developed model was validated and tested on unseen data. It outperformed the results reported in existing studies, achieving 99.72% accuracy, precision, sensitivity, F1, and F2 scores, with a Matthews correlation coefficient of 0.995. It correctly classified all NKC and CKC cases, with a single misclassification involving an SCKC case. The model also performed consistently on 100 additional unseen test cases, supporting its generalizability and robustness in KC screening.
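For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, sensitivity, F1, F2, and the multiclass Matthews correlation coefficient can be derived from a three-class confusion matrix. The matrix is a hypothetical toy example, not the study's data, and support-weighted averaging across classes is an assumption, since the abstract does not state which averaging scheme was used.

```python
from math import sqrt


def f_beta(precision, recall, beta):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


def summarize(cm):
    """cm[i][j] = number of cases with true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    true_counts = [sum(cm[k]) for k in range(n)]
    pred_counts = [sum(cm[i][k] for i in range(n)) for k in range(n)]

    scores = {"accuracy": correct / total, "precision": 0.0,
              "sensitivity": 0.0, "f1": 0.0, "f2": 0.0}
    for k in range(n):
        p = cm[k][k] / pred_counts[k] if pred_counts[k] else 0.0
        r = cm[k][k] / true_counts[k] if true_counts[k] else 0.0
        w = true_counts[k] / total  # ASSUMPTION: support-weighted average
        scores["precision"] += w * p
        scores["sensitivity"] += w * r
        scores["f1"] += w * f_beta(p, r, 1)
        scores["f2"] += w * f_beta(p, r, 2)

    # Multiclass generalization of the Matthews correlation coefficient.
    num = correct * total - sum(p * t for p, t in zip(pred_counts, true_counts))
    den = sqrt((total ** 2 - sum(p * p for p in pred_counts))
               * (total ** 2 - sum(t * t for t in true_counts)))
    scores["mcc"] = num / den if den else 0.0
    return scores


# HYPOTHETICAL confusion matrix (rows: true NKC, SCKC, CKC), with one
# SCKC case predicted as NKC. Illustrative counts only.
toy_cm = [
    [50, 0, 0],
    [1, 29, 0],
    [0, 0, 20],
]
toy_scores = summarize(toy_cm)

# Share of the original features retained: 5 selected out of 79.
feature_fraction = 5 / 79

for name, value in toy_scores.items():
    print(f"{name}: {value:.4f}")
print(f"features retained: {feature_fraction:.2%}")
```

Under support-weighted averaging, sensitivity is always equal to accuracy, and a single off-diagonal entry leaves precision, F1, and F2 within a small margin of it. That is consistent with the metrics coinciding at the reported precision, but it does not confirm the averaging scheme the authors used.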
By combining the strengths of diverse base models and key Pentacam indices, the stacking ensemble approach ensures reliable, accurate KC screening, providing clinicians with an automated tool for early detection and better patient management.