Vezakis Andreas, Vezakis Ioannis, Petropoulou Ourania, Miloulis Stavros T, Anastasiou Athanasios, Kakkos Ioannis, Matsopoulos George K
Biomedical Engineering Laboratory, School of Electrical & Computer Engineering, National Technical University of Athens, 15773 Athens, Greece.
Department of Biomedical Engineering, University of West Attica, 12243 Athens, Greece.
Bioengineering (Basel). 2025 Apr 13;12(4):413. doi: 10.3390/bioengineering12040413.
Ulcerative colitis (UC) is a chronic inflammatory bowel disease characterized by continuous inflammation of the colon and rectum. Accurate disease assessment is essential for effective treatment, with endoscopic evaluation, particularly the Mayo Endoscopic Score (MES), serving as a key diagnostic tool. However, MES measurement can be subjective and inconsistent, leading to variability in treatment decisions. Deep learning approaches have shown promise in providing more objective and standardized assessments of UC severity.
This study used publicly available endoscopic images of UC patients to compare the performance of state-of-the-art deep neural networks for automated MES classification. Several architectures were tested to determine the most effective model for grading disease severity. The F1 score, accuracy, recall, and precision were calculated for all models, and statistical analysis was conducted to test for significant differences between the networks.
VGG19 was the best-performing network, achieving a quadratic weighted kappa (QWK) of 0.876 and a macro-averaged F1 score of 0.7528 across all classes. However, the performance differences among the top-performing models were very small, suggesting that model selection should depend on specific deployment requirements.
This study demonstrates that multiple state-of-the-art deep neural network architectures can automate UC severity classification. Simpler architectures achieved results competitive with those of larger models, challenging the assumption that larger networks necessarily provide better clinical outcomes.
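
For readers unfamiliar with the two headline metrics reported above, the following is a minimal sketch of how quadratic weighted kappa (QWK) and macro-averaged F1 can be computed from predicted and reference MES grades. It is not the authors' evaluation code. The function names, the assumption of four ordinal grades (MES 0 to 3) encoded as integers, and the toy labels in the usage note are all illustrative.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix with rows = reference grade, columns = predicted grade."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def quadratic_weighted_kappa(y_true, y_pred, num_classes=4):
    """QWK: agreement on an ordinal scale, penalising errors by squared distance."""
    observed = confusion_matrix(y_true, y_pred, num_classes)
    n = len(y_true)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(num_classes))
                  for j in range(num_classes)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty: 0 on the diagonal, 1 for the largest possible error.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if predictions were independent of the reference.
            weighted_expected += weight * row_totals[i] * col_totals[j] / n

    if weighted_expected == 0:
        # Only one grade occurs in both label sets, so agreement is trivially perfect.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


def macro_f1(y_true, y_pred, num_classes=4):
    """Unweighted mean of per-class F1, so rare grades count as much as common ones."""
    observed = confusion_matrix(y_true, y_pred, num_classes)
    scores = []
    for k in range(num_classes):
        tp = observed[k][k]
        fp = sum(observed[i][k] for i in range(num_classes)) - tp
        fn = sum(observed[k]) - tp
        denom = 2 * tp + fp + fn
        # Convention assumed here: a class with no support and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes
```

As a sanity check on toy labels, `quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3])` gives 1.0, while fully reversed predictions `[3, 2, 1, 0]` give -1.0. Because QWK penalises a grade 0 versus grade 3 confusion far more than an adjacent-grade confusion, it suits ordinal scores such as the MES, whereas macro F1 treats every misclassification alike.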