Tiankanon Kasenee, Karuehardsuwan Julalak, Aniwan Satimai, Mekaroonkamol Parit, Sunthornwechapong Panukorn, Navadurong Huttakan, Tantitanawat Kittithat, Mekritthikrai Krittaya, Samutrangsi Salin, Vateekul Peerapon, Rerknimitr Rungsun
Division of Gastroenterology, Department of Medicine, Faculty of Medicine, Chulalongkorn University and King Chulalongkorn Memorial Hospital, Thai Red Cross, Bangkok, Thailand.
Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok, Thailand.
Clin Endosc. 2024 Mar;57(2):217-225. doi: 10.5946/ce.2023.145. Epub 2024 Feb 7.
BACKGROUND/AIMS: This study compared the polyp detection performance of "Deep-GI," a newly developed artificial intelligence (AI) model, with that of a previously validated computer-aided polyp detection (CADe) model at various false-positive (FP) thresholds, and sought to determine the best threshold for each model.
METHODS: Colonoscopy videos were collected prospectively and reviewed by three expert endoscopists (gold standard), trainees, CADe (CAD EYE; Fujifilm Corp.), and Deep-GI. Polyp detection sensitivity (PDS), polyp miss rate (PMR), and false-positive alarm rate (FPR) were compared among the three groups (trainees, CADe, and Deep-GI) using different FP thresholds, defined by how long a bounding box remained on the screen.
RESULTS: In total, 170 colonoscopy videos were used in this study. Deep-GI showed the highest PDS (99.4% vs. 85.4% vs. 66.7%, p<0.01) and the lowest PMR (0.6% vs. 14.6% vs. 33.3%, p<0.01) compared with CADe and trainees, respectively. Compared with CADe, Deep-GI demonstrated a lower FPR at FP thresholds of ≥0.5 seconds (12.1 vs. 22.4) and ≥1 second (4.4 vs. 6.8) (both p<0.05). However, when the threshold was raised to ≥1.5 seconds, the FPR became comparable (2 vs. 2.4, p=0.3), while the PMR increased from 2% to 10%.
CONCLUSION: Compared with CADe, Deep-GI demonstrated a higher PDS with a significantly lower FPR at the ≥0.5- and ≥1-second thresholds. At the ≥1.5-second threshold, the two systems showed comparable FPR, with increased PMR.
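The metrics in the abstract can be illustrated with a minimal sketch. It rests on several assumptions the abstract does not confirm: that PMR is the complement of PDS (consistent with the reported pairs, e.g. 99.4% and 0.6%), that an FP alarm counts only when its bounding box stays on screen for at least the threshold duration, and that FPR is averaged per video. The function names and the example durations are hypothetical and are not taken from the study.

```python
from typing import Sequence


def polyp_detection_sensitivity(detected: int, total: int) -> float:
    """PDS: percentage of gold-standard polyps that were detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * detected / total


def polyp_miss_rate(detected: int, total: int) -> float:
    """PMR, assumed here to be the complement of PDS."""
    return 100.0 - polyp_detection_sensitivity(detected, total)


def false_alarm_rate(
    fp_durations_per_video: Sequence[Sequence[float]], threshold_s: float
) -> float:
    """Mean number of FP alarms per video (assumed denominator).

    An FP bounding box counts as an alarm only if it stayed on screen
    for at least threshold_s seconds.
    """
    if not fp_durations_per_video:
        raise ValueError("at least one video is required")
    alarms = sum(
        1
        for durations in fp_durations_per_video
        for d in durations
        if d >= threshold_s
    )
    return alarms / len(fp_durations_per_video)


# Hypothetical FP bounding-box durations (seconds) for two videos.
example_videos = [[0.2, 0.6, 1.1], [0.7, 1.6, 2.3]]
rates = {t: false_alarm_rate(example_videos, t) for t in (0.5, 1.0, 1.5)}
```

Raising the threshold can only lower or hold the alarm count in this sketch, which matches the direction of the FPR trend reported above. The same duration filter would also suppress briefly flagged true polyps, which is one plausible reading of why PMR rises at the ≥1.5-second threshold.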