Marya Neil, Powers Patrick, AbiMansour Jad P, Marcello Matthew, Thiruvengadam Nikhil, Nasser-Ghodsi Navine, Rau Prashanth, Zivny Jaroslav, Mehta Savant, Marshall Christopher, Leonor Paul, Che Kendrick, Abu Dayyeh Barham K, Storm Andrew C, Petersen Bret T, Law Ryan J, Martin John A, Vargas Eric J, Chandrasekhara Vinay
Gastroenterology, UMass Chan Medical School, Worcester, United States.
Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, United States.
Endoscopy. 2025 Jul 6. doi: 10.1055/a-2650-0789.
Clinicians struggle to accurately classify biliary strictures as benign or malignant. Current ERCP-based sampling modalities, including brush cytology and forceps biopsy, have poor sensitivity for pathologic confirmation of malignancy. Cholangioscopy allows direct visualization and sampling of biliary pathology; however, this technology is also associated with inaccurate classification of biliary disease. Previously, an artificial intelligence (AI) model that analyzes cholangioscopy footage was found to be more accurate in diagnosing biliary malignancy than ERCP sampling techniques. The aim of this study was to validate this AI on a new series of examinations.
Three academic centers collected all available, unedited cholangioscopy recordings. The videos were processed by the cholangioscopy AI, which then predicted whether malignancy was present. AI performance in classifying strictures was compared with that of brush cytology and forceps biopsy.
A total of 112 cholangioscopy examinations (containing 4,817,081 images) were obtained from 99 patients. Of those examinations, 61 (54.5%) were performed to investigate biliary strictures (31 [50.8%] benign, 30 [49.2%] malignant). For stricture classification, the AI was 80.0% sensitive and 90.3% specific. The AI was also significantly more accurate for stricture classification (85.2%) than brush cytology (52.5%; p < 0.001), forceps biopsy (68.2%; p = 0.037), and the combination of brush cytology and forceps biopsy (66.7%; p = 0.022).
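The reported sensitivity, specificity, and accuracy can be cross-checked against the stricture counts with a short calculation. The sketch below is illustrative only: the abstract does not report the confusion-matrix cells, so the counts used here (24 of 30 malignant strictures and 28 of 31 benign strictures correctly classified) are assumptions inferred from the stated percentages, and the function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # malignant strictures correctly flagged
    specificity = tn / (tn + fp)              # benign strictures correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# ASSUMPTION: cell counts are inferred from the reported percentages and the
# 30 malignant / 31 benign split; they are not stated in the abstract.
sens, spec, acc = confusion_metrics(tp=24, fn=6, tn=28, fp=3)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Under these assumed counts, the three values agree with the figures reported for the AI (80.0%, 90.3%, and 85.2%).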
In a multicenter validation cohort, a previously developed cholangioscopy AI continued to outperform standard ERCP sampling modalities in identifying malignancy, without additional retraining.