Muhsin Zahra J, Qahwaji Rami, Ghafir Ibrahim, AlShawabkeh Mo'ath, Al Bdour Muawyah, AlRyalat Saif, Al-Taee Majid
Faculty of Engineering and Digital Technologies, University of Bradford, Bradford, UK.
Department of Special Surgery, Faculty of Medicine, The Hashemite University, Zarqa, Jordan.
Comput Biol Med. 2025 Sep;195:110568. doi: 10.1016/j.compbiomed.2025.110568. Epub 2025 Jun 25.
Accurate staging of keratoconus (KC) is crucial for timely intervention and improving patient quality of life. Unlike prior studies that relied on traditional base machine learning (ML) models, this paper proposes a more advanced two-stage ensemble learning model, designed to automate KC severity staging and track disease progression with improved performance.
A clinical dataset collected from Pentacam corneal tomography devices serves as a comprehensive source of corneal data. Following extensive pre-processing, key Pentacam indices strongly correlated with KC severity staging are identified through a rigorous feature selection process and clinically validated. These selected indices are used to train, validate and optimize a two-stage ensemble learner for KC severity staging that combines the strengths of four top-performing base ML models: Random Forest (RF), Gradient Boosting (GB), Decision Tree (DT), and Support Vector Machine (SVM). Three of these base learners are stacked to leverage their complementary strengths, with their predictions aggregated into a new feature matrix. This matrix is then passed as input to the fourth model, a meta-classifier, which generates the final KC staging results.
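The two-stage arrangement described above can be sketched with scikit-learn's `StackingClassifier`. This is only an illustrative sketch, not the authors' implementation: the abstract does not state which of the four models acts as the meta-classifier, so using the SVM in that role is an assumption, as are all hyperparameters. The data are synthetic (`make_classification`) and merely stand in for the Pentacam indices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the selected Pentacam indices (assumption):
# five classes mimic KC severity stages 0-4.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=8, n_redundant=2,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# Hold out 100 samples that take no part in training or cross-validation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=100, stratify=y, random_state=0
)

# Stage 1: three base learners (which three is an assumption).
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("dt", DecisionTreeClassifier(random_state=0)),
]

# Stage 2: the meta-classifier is trained on the base learners'
# cross-validated predictions (SVM as meta-classifier is an assumption).
stack = StackingClassifier(estimators=base_learners, final_estimator=SVC(), cv=5)
stack.fit(X_train, y_train)

meta_features = stack.transform(X_test)  # base-learner outputs seen by stage 2
stage_pred = stack.predict(X_test)       # final predicted stage per sample
test_accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

With `cv=5`, the meta-classifier is fitted on out-of-fold predictions, which keeps it from simply learning the base learners' training-set overfit. The accuracy printed here reflects the synthetic data only and says nothing about the reported clinical results.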
In experimental evaluation, the proposed ensemble learner outperformed previous studies. It achieved an overall validation accuracy of 99.41%, a precision of 99.43%, and a sensitivity of 99.41%. The F1 and F2 scores were 99.42% and 99.41%, respectively. The classification quality, measured by the Matthews correlation coefficient, reached 0.993. Additionally, the model was evaluated on 100 previously unseen test samples, which were entirely excluded from training and cross-validation. It achieved an accuracy of 99%, demonstrating exceptional consistency, robustness, and generalizability in distinguishing among the distinct stages of KC severity (0-4).
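For readers less familiar with these metrics, the following sketch shows how F-beta scores and the multiclass Matthews correlation coefficient are computed. The function names are illustrative, and the confusion matrix is hypothetical: the abstract does not report per-class counts, so the matrix simply places one error among 100 samples for demonstration. The F-beta check feeds in the reported overall precision and sensitivity; since the published scores may be averaged per class, close agreement is expected but not guaranteed.

```python
from math import sqrt


def f_beta(precision, recall, beta):
    """F-beta score: beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def multiclass_mcc(cm):
    """Multiclass Matthews correlation coefficient from a square
    confusion matrix (rows = true class, columns = predicted class)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    true_counts = [sum(cm[k]) for k in range(n)]
    pred_counts = [sum(cm[r][k] for r in range(n)) for k in range(n)]
    num = correct * total - sum(p * t for p, t in zip(pred_counts, true_counts))
    den = sqrt(
        (total * total - sum(p * p for p in pred_counts))
        * (total * total - sum(t * t for t in true_counts))
    )
    return num / den if den else 0.0


# Reported overall precision and sensitivity (recall) from the abstract.
f1 = f_beta(0.9943, 0.9941, beta=1)
f2 = f_beta(0.9943, 0.9941, beta=2)
print(f"F1 = {f1:.4f}, F2 = {f2:.4f}")

# Hypothetical 5-stage confusion matrix: 20 samples per stage, with a
# single stage-2 sample misclassified as stage 3 (not data from the paper).
cm = [
    [20, 0, 0, 0, 0],
    [0, 20, 0, 0, 0],
    [0, 0, 19, 1, 0],
    [0, 0, 0, 20, 0],
    [0, 0, 0, 0, 20],
]
mcc = multiclass_mcc(cm)
print(f"MCC = {mcc:.4f}")
```

Because F2 weights sensitivity four times as heavily as precision, it sits closer to the sensitivity value than F1 does, which matches the ordering of the reported scores.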
The proposed model, developed in collaboration with clinicians, provides a robust foundation for creating a reliable and practical diagnostic tool to detect KC severity stages, track disease progression over time, and evaluate the effectiveness of specific treatments.