Li Huo, Qin Jing, Li Zhongzhuan, Ouyang Rong, Chen Zhixin, Huang Shijiang, Qin Shufen, Huang Qiliang
Department of Gastroenterology, The Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, China.
Department of General Medicine, Liuzhou People's Hospital, Liuzhou, China.
NPJ Digit Med. 2025 Jul 18;8(1):456. doi: 10.1038/s41746-025-01848-z.
This meta-analysis evaluated the diagnostic performance of deep learning (DL) algorithms applied to whole slide images (WSIs) for detecting microsatellite instability-high (MSI-H) status in colorectal cancer (CRC). PubMed, Embase, and Web of Science were searched through January 2025. Nineteen studies comprising 33,383 samples were included. Bivariate random-effects models were used to calculate pooled sensitivity and specificity with 95% confidence intervals (CIs). Study quality was assessed with the revised QUADAS-2 tool. In the pooled patient-based analysis, internal validation showed a sensitivity of 0.88 and a specificity of 0.86, while external validation showed a higher sensitivity of 0.93 but a lower specificity of 0.71. Image-based analysis showed similar accuracy. Meta-regression identified center, reference standard, and tile size as major sources of heterogeneity, with no significant differences observed between internal and external performance. Overall, DL algorithms demonstrate excellent sensitivity in detecting MSI-H; however, their lower specificity in external validation suggests overfitting and highlights the need for algorithm standardization to improve generalizability and clinical utility.
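To make the trade-off between the pooled sensitivity and specificity more concrete, the following minimal Python sketch converts the reported patient-based estimates into positive and negative predictive values using Bayes' rule. The MSI-H prevalence of 0.15 is an illustrative assumption and is not a figure from this abstract; the function and variable names are likewise hypothetical. This is not the bivariate random-effects model used in the meta-analysis, only a back-of-the-envelope reading of its point estimates.

```python
# Illustrative only: turns pooled sensitivity/specificity into predictive values.
# ASSUMED_PREVALENCE is an assumption for illustration, not a value from the abstract.

def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

ASSUMED_PREVALENCE = 0.15  # hypothetical MSI-H prevalence among CRC patients

# Pooled patient-based point estimates reported in the abstract:
# (sensitivity, specificity)
POOLED = {
    "internal": (0.88, 0.86),
    "external": (0.93, 0.71),
}

def summarize(prevalence=ASSUMED_PREVALENCE):
    """Map each validation setting to its (PPV, NPV) at the given prevalence."""
    return {
        name: predictive_values(se, sp, prevalence)
        for name, (se, sp) in POOLED.items()
    }

if __name__ == "__main__":
    for name, (ppv, npv) in summarize().items():
        print(f"{name}: PPV={ppv:.2f}, NPV={npv:.2f}")
```

Under the assumed prevalence, the external-validation estimates give a negative predictive value of roughly 0.98 but a positive predictive value of only about 0.36. This is consistent with reading the reported high sensitivity and lower specificity as better suited to ruling MSI-H out than to confirming it. The exact values shift with whatever prevalence is supplied.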