Machine & Hybrid Intelligence Lab, Department of Radiology, Northwestern University, Chicago, USA.
Indian Institute of Technology, Guwahati, India.
Med Image Anal. 2025 Jan;99:103307. doi: 10.1016/j.media.2024.103307. Epub 2024 Sep 5.
Automatic analysis of colonoscopy images has been an active field of research, motivated by the importance of early detection of precancerous polyps. However, detecting polyps during the live examination can be challenging due to factors such as variation in skill and experience among endoscopists, lack of attentiveness, and fatigue, all of which lead to a high polyp miss rate. There is therefore a need for an automated system that can flag missed polyps during the examination and improve patient care. Deep learning has emerged as a promising solution to this challenge, as it can assist endoscopists in detecting and classifying overlooked polyps and abnormalities in real time, improving the accuracy of diagnosis and enhancing treatment. In addition to an algorithm's accuracy, transparency and interpretability are crucial for explaining the whys and hows of its predictions, because conclusions based on incorrect decisions may be fatal, especially in medicine. Despite these pitfalls, most algorithms are developed on private data or as closed-source or proprietary software, and the methods lack reproducibility. Therefore, to promote the development of efficient and transparent methods, we organized the "Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image Segmentation (MedAI 2021)" competitions. The Medico 2020 challenge received submissions from 17 teams, and the MedAI 2021 challenge received submissions from another 17 distinct teams the following year. We present a comprehensive summary and analysis of each contribution, highlight the strengths of the best-performing methods, and discuss the possibility of translating such methods into the clinic.
Our analysis revealed that participants improved the Dice coefficient from 0.8607 in 2020 to 0.8993 in 2021 despite the addition of diverse and challenging frames (containing irregular, smaller, sessile, or flat polyps), which are frequently missed during routine clinical examination. For the instrument segmentation task, the best team obtained a mean intersection over union (IoU) of 0.9364. For the transparency task, a multidisciplinary team, including expert gastroenterologists, assessed each submission and evaluated the teams on open-source practices, failure case analysis, ablation studies, and the usability and understandability of evaluations, to gain a deeper understanding of the models' credibility for clinical deployment. The best team obtained a final transparency score of 21 out of 25. Through this comprehensive analysis of the challenges, we not only highlight the advancements in polyp and surgical instrument segmentation but also encourage subjective evaluation for building more transparent and understandable AI-based colonoscopy systems. Moreover, we discuss the need for multi-center and out-of-distribution testing to address the current limitations of the methods, with the aim of reducing the cancer burden and improving patient care.
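For readers unfamiliar with the two overlap metrics reported above, the following is a minimal sketch of how the Dice coefficient and intersection over union are commonly computed for a pair of binary segmentation masks. It is not the challenge's official evaluation code: the function name `dice_and_iou`, the nested-list mask format, and the convention of scoring two empty masks as a perfect match are all assumptions made for illustration.

```python
def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks.

    Masks are equal-shaped nested lists of 0/1 values (an assumed format).
    """
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(p)
    truth_area = sum(t)
    union = pred_area + truth_area - intersection

    # Assumed convention: two empty masks agree perfectly.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + truth_area)
    iou = intersection / union
    return dice, iou


# Toy 2x3 masks: 2 overlapping pixels, 3 predicted, 3 in the ground truth.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.4f}, IoU = {iou:.4f}")  # Dice = 0.6667, IoU = 0.5000
```

For a single mask pair the two metrics are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. A "mean" score such as the mean IoU quoted above is usually an average of per-image values, although the abstract does not specify the averaging scheme.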