Tang Xueyao, Zhou Hong, Liu Ying, Gao Shan, Zhou Yang
Department of Ultrasound, Affiliated Hospital of Southwest Jiaotong University, The Third People's Hospital of Chengdu, Chengdu, Sichuan, China.
Department of Geriatric, Hospital of Chengdu Office of People's Government of Tibetan Autonomous Region, Chengdu, Sichuan, China.
Sci Prog. 2025 Apr-Jun;108(2):368504251346906. doi: 10.1177/00368504251346906. Epub 2025 Jun 4.
Background
The incidence of cervical lymph node metastasis (CLNM) in thyroid cancer (TC) is high. Accurate preoperative diagnosis of CLNM is critical to reduce unnecessary lymph node dissection and complications in TC patients. Ultrasound (US)-based artificial intelligence (AI) systems show promise for CLNM prediction, but their diagnostic performance requires systematic evaluation.
Methods
Four electronic databases (Web of Science, Embase, PubMed, and Cochrane Library) were searched comprehensively from inception to 30 December 2023. A random-effects model was used to calculate the pooled diagnostic indicators. Sensitivity analysis and heterogeneity testing were conducted.
Results
Across the 19 included studies, the pooled sensitivity, specificity, and area under the curve (AUC) of the AI systems were 0.76 (95% confidence interval (CI): 0.71-0.80), 0.78 (95% CI: 0.74-0.82), and 0.84 (95% CI: 0.15-0.99), respectively. In clinically node-negative (cN0) patients, the sensitivity, specificity, and AUC were 0.73 (95% CI: 0.68-0.77), 0.81 (95% CI: 0.76-0.85), and 0.83 (95% CI: 0.14-0.99). For central CLNM, the sensitivity, specificity, and AUC were 0.73 (95% CI: 0.69-0.77), 0.77 (95% CI: 0.72-0.81), and 0.81 (95% CI: 0.14-0.99). Multicenter studies yielded higher sensitivity (0.79 vs. 0.75, P < 0.01) and specificity (0.79 vs. 0.78, P < 0.01) than single-center studies. Deep learning (DL) yielded higher sensitivity (0.79 vs. 0.74, P < 0.01) and specificity (0.83 vs. 0.75, P < 0.01) than classic machine learning. Studies published after 2022 yielded higher sensitivity than those published before 2022 (0.77 vs. 0.74, P < 0.01). Studies from China had lower specificity than studies from other countries (0.78 vs. 0.80, P = 0.01). Models incorporating multimodal features had higher specificity than unimodal US models (0.79 vs. 0.75, P < 0.01).
Conclusion
US-based AI systems exhibit favorable predictive value for CLNM in TC, particularly with DL and multimodal designs, potentially reducing overtreatment. Prospective validation is needed prior to clinical adoption.
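To illustrate what pooling a diagnostic indicator under a random-effects model involves, the following is a minimal sketch of DerSimonian-Laird pooling of per-study sensitivities on the logit scale. It is a simplified univariate approach chosen for illustration only. The abstract does not state which estimator or software was used, and diagnostic meta-analyses often fit a bivariate model instead. The function name `pool_random_effects` and the study counts are hypothetical and are not taken from the included studies.

```python
import math


def inv_logit(x):
    """Map a logit value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_random_effects(events, totals):
    """DerSimonian-Laird pooling of proportions on the logit scale.

    events: per-study counts (e.g. true positives)
    totals: per-study denominators (e.g. all patients with CLNM)
    Returns (pooled estimate, lower 95% CI, upper 95% CI, tau^2).
    """
    ys, vs = [], []
    for e, n in zip(events, totals):
        a = e + 0.5          # 0.5 continuity correction keeps the logit finite
        b = n - e + 0.5
        ys.append(math.log(a / b))      # logit of the corrected proportion
        vs.append(1.0 / a + 1.0 / b)    # approximate within-study variance

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    w = [1.0 / v for v in vs]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))

    # Between-study variance tau^2 (method of moments, truncated at zero)
    k = len(ys)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (v + tau2) for v in vs]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return (inv_logit(y_re),
            inv_logit(y_re - 1.96 * se),
            inv_logit(y_re + 1.96 * se),
            tau2)


# Hypothetical counts for four studies (not data from the meta-analysis)
true_positives = [45, 80, 30, 120]
patients_with_clnm = [60, 100, 45, 150]
sens, ci_low, ci_high, tau2 = pool_random_effects(true_positives, patients_with_clnm)
print(f"pooled sensitivity {sens:.2f} (95% CI: {ci_low:.2f}-{ci_high:.2f}), tau^2 = {tau2:.3f}")
```

The same function could be applied to true negatives and patients without CLNM to pool specificity. Pooling the two separately ignores the correlation between sensitivity and specificity, which is one reason bivariate models are commonly preferred.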