Gadi Sanjay R V, Mori Yuichi, Misawa Masashi, East James E, Hassan Cesare, Repici Alessandro, Byrne Michael F, von Renteln Daniel, Hewett David G, Wang Pu, Saito Yutaka, Matsubayashi Carolina Ogawa, Ahmad Omer F, Sharma Prateek, Gross Seth A, Sengupta Neil, Mansour Nabil, Cherubini Andrea, Dinh Nhan Ngo, Xiao Xiao, Mountney Peter, González-Bueno Puyal Juana, Little Greg, LaRocco Shawn, Conjeti Sailesh, Seibt Hannes, Zur Dror, Shimada Hitoshi, Berzin Tyler M, Glissen Brown Jeremy R
Department of Medicine, Duke University School of Medicine, Durham, North Carolina, USA.
Digestive Disease Center, Showa University Northern Yokohama Hospital, Yokohama, Japan; Clinical Effectiveness Research Group, Institute of Health and Society, University of Oslo, Oslo, Norway; Department of Transplantation Medicine, Oslo University Hospital, Oslo, Norway.
Gastrointest Endosc. 2025 Jul;102(1):109-116.e2. doi: 10.1016/j.gie.2024.11.042. Epub 2024 Nov 26.
Multiple computer-aided detection (CADe) software programs have now achieved regulatory approval in the United States, Europe, and Asia and are being used in routine clinical practice to support colorectal cancer screening. There is uncertainty regarding how different CADe algorithms may perform. No objective methodology exists for comparing different algorithms. We aimed to identify priority scoring metrics for CADe evaluation and comparison.
A modified Delphi approach was used. Twenty-five global leaders in CADe in colonoscopy, including endoscopists, researchers, and industry representatives, participated in an online survey over the course of 8 months. Participants generated 121 scoring criteria, 54 of which were deemed within the study scope and distributed for review and asynchronous e-mail-based open comment. Participants then scored criteria in order of priority on a 5-point Likert scale during ranking round 1. The top 11 highest priority criteria were re-distributed, with another opportunity for open comment, followed by a final round of priority scoring to identify the final 6 criteria.
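The shortlisting step described above (score each criterion on a 5-point Likert scale, then carry the highest-scoring criteria into the next round) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name `top_k_by_mean`, the criterion labels, and the toy ratings are all hypothetical, and the abstract does not say how ties at the cutoff were resolved.

```python
from statistics import mean


def top_k_by_mean(ratings, k):
    """Return the k criteria with the highest mean Likert score as (name, mean) pairs.

    `ratings` maps a criterion name to the list of 1-5 scores it received.
    Ties keep their input order (Python's sort is stable); that tie rule is
    an assumption of this sketch, not something stated in the source.
    """
    for name, scores in ratings.items():
        if not scores or any(s not in range(1, 6) for s in scores):
            raise ValueError(f"invalid 5-point Likert scores for {name!r}")
    means = {name: mean(scores) for name, scores in ratings.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical toy data: four criteria, three raters each (not study data).
toy_ratings = {
    "criterion A": [5, 4, 4],
    "criterion B": [3, 3, 2],
    "criterion C": [4, 5, 5],
    "criterion D": [2, 3, 3],
}

shortlist = top_k_by_mean(toy_ratings, k=2)
for name, score in shortlist:
    print(f"{name}: {score:.2f}")
```

In the study itself the same kind of cut was applied twice: from 54 criteria to 11 after round 1, and from 11 to the final 6.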
Mean priority scores for the 54 criteria ranged from 2.25 to 4.38 after the first ranking round. The top 11 criteria after round 1 of ranking yielded mean priority scores ranging from 3.04 to 4.16. The final 6 highest priority criteria, including a tie for first-place ranking, were (1, tied) sensitivity (average, 4.16) and (1, tied) separate and independent validation of the CADe algorithm (average, 4.16); (3) adenoma detection rate (average, 4.08); (4) false-positive rate (average, 4.00); (5) latency (average, 3.84); and (6) adenoma miss rate (average, 3.68).
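The final ordering, in which two criteria tie for first place and the next is ranked third, matches standard competition ranking. The sketch below applies that rule to the six mean scores reported above; the function name `competition_rank` is hypothetical, and the ranking rule is inferred from the reported ranks rather than stated by the authors.

```python
def competition_rank(scores):
    """Rank items by descending score; tied items share a rank and the next rank is skipped."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    ranked = []
    for position, (name, score) in enumerate(ordered, start=1):
        if ranked and score == ranked[-1][2]:
            rank = ranked[-1][0]  # same score as the previous item: share its rank
        else:
            rank = position  # otherwise the rank is the 1-based position
        ranked.append((rank, name, score))
    return ranked


# Mean priority scores of the final six criteria, as reported in the abstract.
reported_means = {
    "sensitivity": 4.16,
    "separate and independent validation": 4.16,
    "adenoma detection rate": 4.08,
    "false-positive rate": 4.00,
    "latency": 3.84,
    "adenoma miss rate": 3.68,
}

final_ranking = competition_rank(reported_means)
for rank, name, score in final_ranking:
    print(f"({rank}) {name}: {score:.2f}")
```

Exact equality is safe here because the inputs are already rounded to two decimals; means computed from raw ratings would likely need rounding before ties are compared.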
This is the first reported international consensus statement of priority scoring metrics for CADe in colonoscopy. These scoring criteria should inform CADe software development and refinement. Future research should validate these metrics on a benchmark video dataset to develop a validated scoring instrument.